Declaration of Bjorn R. Olsen, Ph.D. Under 37 CFR § 1.132

I, BJORN R. OLSEN, Ph.D., hereby declare as follows:



1. I am a full Professor in the Department of Cell Biology at Harvard
Medical School and have conducted and directed research in anatomy, physiology and
molecular biology. I have served as the Chairman for the Program in Cell and
Developmental Biology at Harvard Medical School, Co-Chairman for the New York
Academy of Sciences Conference on "Structure, Molecular Biology, and Pathology of
Collagen", full Professor in the Department of Biochemistry at Rutgers Medical School,
and President for the International Society for Matrix Biology. In addition, I have
chaired numerous conferences, including the Gordon Conference "Structural
Macromolecules Collagen" and the New York Academy of Sciences Conference on
"Molecular and Developmental Biology of Cartilage". My achievements in the areas of
molecular biology and medicine have been recognized by several awards such as the
MERIT Award for Scientific Excellence (1992 and 1993 awarded by the National
Institutes of Health), Distinguished Faculty Award (Harvard University) and Honorary
Doctorate Degrees in Science (University of Medicine and Dentistry of New Jersey and
University of Oslo, Norway). I have participated on the scientific advisory boards of
several major biomedical and biopharmaceutical companies such as Metra Biosystems,
Inc. and OsteoArthritis Sciences, Inc. In addition, I have served on the Board of Directors
of Organogenesis, Inc. since 1994. At both Rutgers Medical School and Harvard Medical
School, I have participated in numerous committees, including Faculty Council and
Subcommittee of Professors, and Standing Committee on Student Prizes and Awards in
the Faculty of Medicine. I have served and continue to serve on various Editorial Boards
of major scientific journals such as the Journal of Cell Biology, Current Opinion in Cell
Biology and Journal of Biological Chemistry. My teaching experience encompasses
many areas of biology and medicine such as anatomy, physiology, histology, gross
anatomy, biochemistry, cell biology, and developmental biology.

My major areas of research interests include the role of the extracellular 
matrix in embryonic development, skeletal cell and molecular biology, molecular 
genetics, molecular pathology and the structure, biosynthesis and function of collagens. I 
have published over 270 scientific papers in these and other related areas. A complete 
listing of these papers, publications, theses and/or dissertations accompanies the copy of 
my curriculum vitae, which is attached hereto as Exhibit A. 

2. I have reviewed U.S. Patent Application Serial No. 09/154,302, entitled 
"Methods of Inhibiting Angiogenesis with Endostatin Protein," and understand the 
claimed invention to comprise a method of inhibiting angiogenesis in an individual 
comprising the administration of a protein, wherein the protein is a fragment of a
C-terminal non-collagenous (NC1) region of a collagen protein.

The Office Action, dated September 28, 2001, rejected applicants' claims 
stating in part that the specification, while being enabling for methods using endostatin 
from collagen XVIII, does not reasonably provide enablement for the broad claims 
pertaining to the use of any inhibitory protein molecule that has an NC1 region.

The Office Action further stated that applicants' specification indicates 
that proteins that share a common or similar N-terminal amino acid sequence with 
endostatin would possess antiangiogenic activity, which according to the Action indicates 
that sequence similarity is needed. 

The Action concluded that although applicants "argued that all NC1
regions are known to have a helical structure, this structure has not been correlated to
antiangiogenic activity."

Being one skilled in the relevant art, I respectfully disagree with the
rejections set forth in the September 28, 2001 Office Action for the reasons provided below.

3. Applicants have defined the claimed invention according to both structure
and function, and, based on applicants' description together with the level of knowledge
in this field at the time the application was filed, one skilled in the art would easily be
able to identify antiangiogenic proteins derived from collagen proteins. The structural
feature of importance is the C-terminal noncollagenous region (NC1) of collagen
proteins. The functional feature of importance is antiangiogenic activity.

4. As indicated in the specification (page 37, lines 3-15) and known to those
skilled in the art, the NC1 region is a defined structural feature that is present in many
collagen proteins. (See, for example, the chapters on collagens in the text Guidebook
to the Extracellular Matrix, Anchor, and Adhesion Proteins, second edition,
Kreis and Vale, eds., Oxford University Press, 1999, pp. 380-408, which were authored
by Yoshifumi Ninomiya and myself, a copy of which is attached as Exhibit B.)

The collagens comprise a large family of genetically distinct, but 
structurally related proteins. A prominent and common feature of most of the collagens is 
that they contain both collagenous and noncollagenous regions. Specifically, the 
collagenous regions of collagen molecules consist of the conserved structural feature of 
three polypeptide chains, called alpha chains, that wind around each other in a right- 
handed helix in each molecule to form a characteristic collagen triple helix. The ability of 
collagenous proteins to form structures of high tensile strength is based on the rigid 
structure of collagen molecules. Collagen polypeptides contain one or more blocks of 
Gly-Xaa-Yaa repeats, in which Yaa frequently represents a proline or hydroxyproline 
residue. The presence of such sequence repeats allows groups of three polypeptides to 
fold into triple-helical domains that are rigid and inextensible. The use of such triple- 
helical domains was initially thought to be limited to molecules that make up collagen 
fibrils in tissues, but it is now known that such domains are present in a majority of 
collagen proteins. As further stated by Rehn and Pihlajaniemi, most collagen molecules 
contain noncollagenous sequences at their termini, and several types also have them as 
interruptions separating adjacent triple helical regions (Proc. Natl. Acad. Sci. USA
91:4234, 1994, attached as Exhibit C). Accordingly, those skilled in the art easily 
recognize the highly conserved features of collagens consisting of right-handed triple 
helixes as the collagenous regions, and intervening areas and terminal portions as the 
noncollagenous regions. 

5. The Office Action further states that NC1 regions are ill-defined, in part
because the collagen fragments claimed by applicants may potentially comprise
non-homologous amino acid sequences, and that applicants' specification "requires" homology
among the claimed proteins in order to assure antiangiogenic activity. Being one skilled
in the art and very familiar with the structure and function of collagen molecules, I
respectfully disagree with the Examiner's conclusion that homology must be present in
light of applicants' teachings and description in the specification. Although amino acid
sequence homology among proteins can be a predictor of similar functions, it is not the
only aspect that should be considered when making this assessment. Given applicants'
teaching of NC1 domains, their location and methods for identifying antiangiogenic
activity, together with the level of knowledge of those skilled in the art at the time of
applicants' priority date, a skilled artisan could reasonably identify potential
antiangiogenic fragments from C-terminal NC1 regions of collagen proteins without
relying on information concerning amino acid sequences or homology to previously
identified antiangiogenic fragments.

As discussed above, the most conserved regions of collagen molecules 
consist of the collagenous regions that are structurally identified by the prominent triple 
helical alpha chains. The areas that appear between the collagenous regions and at the 
terminal portions are identified as noncollagenous regions. Certainly some of the 
noncollagenous regions have homologous amino acid sequences, but given that 
"collagens comprise a large family of genetically distinct, but structurally related 
proteins", it is not surprising that much of the genetic diversity is observed in the 
noncollagenous regions. The lack of homology among the collagen molecules as 
discussed by the Examiner, therefore, actually supports the novelty and patentability of 
applicants' invention. Given the disparity in sequences of certain collagen molecules and 
in the absence of the teachings in applicants' specification, one would not expect the
noncollagenous regions of such molecules to have similar activity; more particularly, it
would not be obvious based on sequence identification that the molecules would have
comparable effects on angiogenesis. However, despite the disparity in sequences, based
on the knowledge of collagen molecule structure and function together with applicants'
teaching regarding NC1 region fragments, one skilled in the art could identify potential
antiangiogenic fragments from collagen proteins. Applicants' novel discovery provides
that specific domains of molecules, possibly having disparate sequence identities, have
similar functions.

6. Applicants have defined the claimed invention in terms of structure
(C-terminal NC1 domains) and function. The common functional feature shared by all
members of this novel genus is the ability to inhibit angiogenesis. Applicants'
groundbreaking discovery is that proteins comprising a genus of fragments of the C-terminal
non-collagenous region of a collagen protein are antiangiogenic and can be used for the
treatment of angiogenesis-mediated diseases including cancer. Page 14, lines 19-23, of
applicants' specification defines cancer as angiogenesis-dependent cancers and tumors.
As stated above, it was known that all members of the collagen family were structurally
related and had non-collagenous domains at their C-terminal ends. Examples 1-3 teach
how to isolate antiangiogenic fragments from this region. The specification further
provides a method for evaluating antiangiogenic activity using assays such as the CAM
assay (page 40, lines 1-21, of the applicants' specification). Therefore, it is my opinion
that one of ordinary skill in the art would be able to isolate antiangiogenic fragments from
any collagen, test for antiangiogenic activity, and detect the presence of endostatin
proteins with an endostatin protein-specific binding antibody using the teachings of
applicants' patent application.

7. Based on the foregoing, I believe that the applicants have sufficiently
enabled the claimed invention. One skilled in the art could readily identify fragments
from the C-terminal NC1 regions of collagen molecules and assess them for antiangiogenic
activity according to the methods and techniques provided in applicants' specification.

8. I understand that all statements made herein of my own knowledge are
true and that all statements made on information and belief are believed to be true, and
further that these statements are made with the knowledge that willful false statements and the
like are punishable by fine or imprisonment, or both, under 18 U.S.C. § 1001, and that
such willful false statements may jeopardize the validity of the application or any patent
issuing thereon.
Education:

1967 Ph.D. University of Oslo, Norway

1967 M.D. University of Oslo Medical School, Norway

Research Fellowships:

1963-1967 Research Assistant (Histology), Anatomical Institute, University of Oslo, Oslo, Norway

Academic Appointments:

1967-1971 Assistant Professor, Anatomical Institute, University of Oslo, Oslo, Norway

1971-1975 Associate Professor, Anatomical Institute, University of Oslo, Oslo, Norway

1972-1976 Associate Professor, Department of Biochemistry, CMDNJ-Rutgers Medical School, Piscataway, NJ

1976-1985 Professor, Department of Biochemistry, CMDNJ-Rutgers Medical School,

1985-1993 Hersey Professor of Anatomy, Department of Anatomy and Cellular Biology, Harvard Medical School, Boston, MA

1990-1993 Chairman, Program in Cell and Developmental Biology, Harvard Medical

1993- Hersey Professor of Cell Biology, Department of Cell Biology, Harvard Medical School, Boston, MA

1996- Harvard-Forsyth Professor of Oral Biology, Harvard-Forsyth Department of Oral Biology, Harvard School of Dental Medicine, Boston, MA

1996- Chairman, Harvard-Forsyth Department of Oral Biology, Harvard School of Dental Medicine, Boston, MA


Other Professional Positions and Major Visiting Appointments: 

1969-1970 Army Service Researcher, Norwegian Defense Research Establishment, Division of Toxicology, Kjeller, Norway

1971-1972 Visiting Scientist, General Clinical Research Center, Philadelphia General Hospital, Philadelphia, PA

Awards and Honors: 

1963 Voss Award for Anatomical Research, University of Oslo Medical School

1974 A. and K.E. Schreiner Award for Biological Research, Norwegian Academy of Sciences

1983 Co-chairman, Gordon Conference "Structural Macromolecules Collagen"

1984 Co-chairman, New York Academy of Sciences Conference on "Biology, Chemistry and Pathology of Collagen"

1985 Honorary A.M. Degree, Harvard University

1989 Co-chairman, New York Academy of Sciences Conference on "Structure,

1976-1980 Member, Molecular Cytology Study Section, NIH

1992-1996 Scientific Advisory Board, OsteoArthritis Sciences, Inc., Cambridge, MA

1994-2002 Board of Directors, Organogenesis, Inc., Canton, MA

1995-1998 National Arthritis and Musculoskeletal and Skin Diseases Advisory Council, National Institutes of Health, Bethesda, MD

2000- Scientific Advisory Board, Prochon Biotech, Ltd., Israel

2001- Scientific Advisory Board, National Marfan Foundation

University of Oslo:

1967-1971 Director, Laboratory for Molecular Anatomy, Anatomical Institute

1970 Member, University of Tromso Planning Committee

Rutgers Medical School:

1975-1978 Committee of Review

1978 Acting Chairman, Department of Biochemistry

1979-1984 Institutional Biosafety Committee

1980-1981 Pathology Chairman Search Committee

1981-1982 Microbiology Chairman Search Committee
1983-1985 Chairman, Institutional Radiation Safety Committee 
1983-1985 Biotechnology Center Planning Committee 

1985 Research Committee 

Harvard Medical School: 

Subcommittee of Professors 
Faculty Council

Chairman - Ad Hoc Evaluation Committee for Professor of Orthopaedic Surgery at Massachusetts General Hospital

Faculty Council - Docket Committee

Member - Faculty of Arts and Sciences

Dunham Lectureship Committee

Advisory Board - Child Health Research Center

M.D.-Ph.D. Admissions Committee

and Women's Hospital

Ad Hoc Search Committee for Professor of Cellular and Molecular

Scientific Advisory Board - Massachusetts General Hospital

Chairman - Advisory Committee, Graduate Program in Biological and Biomedical Sciences



Editorial Boards: 

1991-1994, 1998- Journal of Cell Biology

1985-1991 Developmental Biology, Associate Editor

1986-1994 Calcified Tissue International

1986-1994 Development

1989-2002 Matrix, Matrix Biology, Associate Editor

2002- Matrix Biology, Editor-in-Chief

1991- Molecular Biology of the Cell (formerly Cell Regulation)

1991-1998 Developmental Dynamics, Associate Editor

1995- Current Opinion in Cell Biology

1996-2001 Journal of Biological Chemistry

Memberships:

American Chemical Society 

American Association for the Advancement of Science 
American Association of University Professors 
The American Society for Cell Biology 
The American Society of Biological Chemists 
American Association of Anatomists 
East Coast Connective Tissue Society 
International Society for Matrix Biology 
Major Research Interests: 

1. The role of extracellular matrix in embryonic development, particularly skeletal and vascular 
morphogenesis 

2. Skeletal cell and molecular biology, molecular genetics and molecular pathology 

3. Structure, biosynthesis and function of collagens 

Teaching Experience: 

1963-1967 Anatomy and physiology courses, Norwegian State College of Nursing, Oslo 

1967-1971 Histology courses, University of Oslo Medical School

1972-1973 Biochemistry course for first year medical students, Rutgers Medical School

1974-1975 Histology and gross anatomy courses for first year medical students, University of Oslo Medical School

1975-1985 Biochemistry course for first year medical students, Rutgers Medical School

Course in recombinant DNA techniques and participation in course in electron microscopy techniques, for graduate students at Rutgers Medical School

1985- Lecturer, graduate student courses in Cell Biology and Developmental

1986 Tutor, New Pathway course "The Body" for first year medical students, Harvard Medical School

1988-1990 Senior Tutor, Cannon Society, Harvard Medical School

1978-79, Best Lecturer of the First Year Class, Student Association,
Collagens



The collagens constitute a superfamily of extracellular 
matrix proteins with a structural role as their primary 
function. Based on the exon structure of their genes as 
well as the configuration of the sequence domains of the 
proteins, they can be divided into several families or 
groups. Within each family, several homologous genes 
encode polypeptides that have domains with similar 
sequences. All collagenous proteins have domains with a
triple-helical conformation. Such domains are formed by
three subunits (α chains), each containing a (Gly-X-Y)n
repetitive sequence motif.
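
The repetitive motif described above can be illustrated with a short sketch. The function name and the toy sequence below are hypothetical and not taken from the chapter; the code only assumes a one-letter amino acid string and reports the longest uninterrupted run of Gly-X-Y triplets, which is a crude indicator of a candidate triple-helical stretch and nothing more.

```python
import re

# Hypothetical helper (not from the source text): find the longest
# uninterrupted run of Gly-X-Y triplets in a one-letter protein sequence.
# The lookahead lets every start position be tried, so runs in any
# reading frame are considered.
_RUN = re.compile(r"(?=((?:G[A-Z]{2})+))")

def longest_gly_xy_run(seq):
    """Return (start_index, triplet_count) of the longest Gly-X-Y run."""
    best = (0, 0)
    for m in _RUN.finditer(seq.upper()):
        triplets = len(m.group(1)) // 3
        if triplets > best[1]:
            best = (m.start(), triplets)
    return best

toy = "MGGPPGAPGERKK"  # invented toy sequence, for illustration only
print(longest_gly_xy_run(toy))
```

A real analysis would have to tolerate the short interruptions found in non-fibrillar collagens; this sketch deliberately does not.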

The presence of domains with a triple-helical molecular
conformation (Fig. 1) provides collagens with regions of
rigid, rod-like molecular structures. 1,2 In fibrillar collagens
and short chain collagens each collagen molecule (after
complete proteolytic removal of amino and carboxyl
propeptides) contains only one such domain which
accounts for almost the entire length of the molecule. In
other collagens, such as FACIT collagens, basement membrane
collagens, multiplexins, and collagens with transmembrane
domains - MACITs, several short triple-helical
domains are separated by non-triple-helical sequences.

Within triple-helical domains, each α chain is coiled
into an extended left-handed polyproline II helix and
three α chains are in turn twisted into a right-handed
superhelix. The high resolution crystal structure of a
triple-helical collagen-like peptide shows that the triple
helix is surrounded by a cylinder of hydration; an extensive
network of hydrogen bonds between water molecules
and peptide acceptor groups stabilizes the
structure. 2 Residues of 4-hydroxyproline in the Y-position
of the (Gly-X-Y)n repeat sequence play a critical role in
the hydrogen-bonded structure. The post-translational
hydroxylation of collagen polypeptides therefore causes a
significant increase in the thermal denaturation (melting)
temperature of collagen triple helices. The triple-helical
conformation requires a close packing of every third
residue in each α chain along the triple-helical axis and
only glycyl residues can be accommodated in this position.
This explains why collagen mutations in which such
triple-helical glycyl residues are replaced by residues with
more bulky side chains can cause severe abnormalities.
Even replacement with an alanine residue can result in a
local untwisting of the triple helix and an alteration in
the characteristic hydrogen bonding pattern. 2

Figure 1. Molecular structure of the triple-helical
conformation; three left-handed helices form a right-handed
superhelix.

Collagenous proteins usually form supramolecular 
aggregates (fibrils, filaments, or networks), either alone 
or in conjunction with other extracellular matrix compo- 
nents. Their major function is to contribute to the struc- 
tural integrity of the extracellular matrix, or to help 
anchor cells to the matrix. Some of the non-fibrillar colla- 
gen types appear also to have important regulatory 
functions. 

Based on detailed analyses of the exon structures of 
genes that encode collagenous proteins, a comparison of 
protein domains, and functional considerations, the colla- 
gen superfamily can be divided into several subfamilies, 
as follows. 

1. Fibrillar collagens. This group includes types I, II, III, V, 
and XI collagen, with moieties forming banded 
(cross-striated) fibrils in various tissues. 

2. FACIT collagens. These include types IX, XII, XIV, XVI,
and XIX collagens, with molecules that are associated 
with fibrils formed by fibrillar collagens. 

3. Short chain collagens. These include types VIII and X 
collagen, with short, dumb-bell shaped molecules that 
form part of unique networks in basement membrane 
regions (type VIII) and hypertrophic cartilage (type X). 

4. Basement membrane collagens. These include several 
different molecules collectively known as type IV colla- 
gens. They represent the major collagenous compo- 
nents of basement membranes. 

5. Multiplexins. These are molecules with multiple short 
triple-helical domains that are found mostly in base- 
ment membrane regions. Types XV and XVIII collagen 
currently belong to this group. 

6. Collagens with transmembrane domains - MACITs. 
Types XIII and XVII collagen are cell-surface molecules 
with multiple extracellular triple-helical domains, con- 
nected to a cytoplasmic region by a transmembrane 
segment. Their orientation (with the carboxyl end in
the extracellular space) is similar to that of other cell
surface molecules with triple-helical domains, such as
the type I macrophage scavenger receptor, 3 possibly
the B-chain of the C1q complex, 4,5 and the
macrophage protein MARCO. 6 The last three molecules
have not traditionally been included among the
collagens, but given their triple-helical domains 7 they
can, with good justification, be described as collagen
types. In fact, several proteins with triple-helical
domains are not included among the collagens, most
likely because they were not discovered in a 'collagen
laboratory'. These include the collectins 8,9 (lung surfactant
protein A, lung surfactant protein D, bovine agglutinin,
collectin-43, and mannose binding protein),
ficolins, 10,11 hibernation proteins, 12 the asymmetric
form of acetylcholinesterase, 13,14 the subunits of C1q, 15
and a component of the inner ear. 16



7. Other collagens. This group includes molecules (types 
VI and VII) that form specialized structures in a variety 
of tissues (e.g. microfibrils for type VI, anchoring fibrils 
for type VII). 

Most collagen genes have now been cloned and their 
chromosomal locations determined. The recently 
described collagen types were found through molecular 
cloning; additional collagenous molecules are likely to be 
discovered through analyses of ESTs and genome 
sequences. The very large number of sequence entries in 
the GenBank/EMBL data bank makes it impractical to list 
the appropriate accession numbers in the pages that 
follow. The reader is instead referred to original articles 
for specific information. The chromosomal locations of
the human genes are given in Table 1. Useful databases
for all collagens are Online Mendelian Inheritance in Man
(OMIM), which can be accessed at http://www3.ncbi.nlm.nih.gov/Omim/,
and allied resources and links.
Collagen types I, II, III, V, and XI participate in the formation
of fibrils with molecules packed in quarter-staggered
arrays. Encoded by homologous, multiexon genes, they 
evolved to provide multicellular organisms, from sponges 
to humans, with supramolecular scaffolds for mechanical 
support and the proper environment for cellular migra- 
tion, attachment and differentiation. Fibrillar collagens 
are synthesized as precursors, procollagens, that are pro- 
teolytically processed to collagen in the extracellular 
space. 

Fibrillar collagens include five different molecular types
(I, II, III, V, and XI) containing polypeptide subunits (α
chains), encoded by nine distinct genes. 1-6 The molecules
are either homotrimers with α chains of the same kind
(types II and III) or heterotrimers with two or three different
α chains (types I, V, and XI). Some α chains participate
in the formation of more than one collagen type. For
example, the product of the COL2A1 gene, α1(II), forms
homotrimeric type II collagen molecules and participates,
as α3(XI), in the formation of heterotrimeric type XI collagen
molecules. Also, α1(XI) chains appear both in heterotrimeric
type XI molecules in cartilage as well as in a
bone variant of type V collagen, where they replace the
α1(V) chain. 7 Finally, α2(V) chains are found both in heterotrimeric
type V molecules and in the vitreous form of
type XI. 8 Because of their great similarity and ability to
form mixed heterotrimers, types V and XI collagen are
frequently referred to as type V/XI collagen. 9

Each fibrillar collagen α chain contains over 300 repeats
of the triplet sequence -Gly-X-Y-, flanked by short
non-triplet-containing sequences, telopeptides, at each end.
About 50 per cent of the prolyl residues in the Y positions
of the triplet domain are post-translationally converted
to 4-hydroxyproline by the enzyme prolyl 4-hydroxylase
(EC 1.14.11.2), located in the rough endoplasmic reticulum. 10
The active enzyme is a tetramer of two non-identical
subunits, α and β. The β subunit is the enzyme protein
disulphide isomerase, and the tetramer prolyl 4-hydroxylase
has disulphide isomerase activity. 11 In addition to
prolyl hydroxylation, some lysyl residues in the Y position
are hydroxylated by lysyl hydroxylase, 12 and the sequential
action of galactosyl hydroxylysyl transferase and
glucosylgalactosyl hydroxylysyl transferase adds mono- and
disaccharides to some hydroxylysyl residues. 13



The α chains of fibrillar collagens are synthesized as
pro-α chains, with amino (N) and carboxyl (C) propeptides
flanking the central (Gly-X-Y)n-containing domain (Fig.
1). Folding of a trimeric C-propeptide domain is the first
step in intracellular assembly of homo- or heterotrimeric
procollagen molecules; the chain composition of collagen
molecules is therefore determined by the specificity by
which C-propeptides of various procollagens interact. The
folding of the triple-helical domain proceeds from the
carboxyl end towards the amino end of the trimeric molecule
in a zipper-like fashion and with a rate that is
limited by cis-trans isomerization of peptidyl prolyl
bonds. 4 Since prolyl and lysyl hydroxylases do not work
on triple-helical substrates, 10,12 and the thermal stability
of the triple helix depends on the level of hydroxylation
of prolyl residues, the folding of the triple helix limits the
degree of post-translational hydroxylation to what is
needed for a stable triple helix at 37°C. 5 Substitutions of
glycine residues in the Gly-X-Y repeats lower the stability
and may lead to overmodification and a decrease in the
rate of secretion from cells, resulting in intracellular retention
and degradation. 14

During extracellular processing of procollagen to collagen,
the propeptides are removed from the major collagen
triple-helical domain by specific endoproteinases. 13
Of great interest is the recent finding that the C-proteinase
is identical to BMP-1, the mammalian homologue
of the Drosophila tolloid gene product. 15 In Drosophila
the tolloid protease is involved in processing the precursor
of the BMP-2/4-like product of the decapentaplegic
(dpp) gene; BMP-1 may likewise activate latent forms of
BMPs or other members of the TGF-β family of molecules.
Controlled cleavage of the N-propeptide domain plays a
role in fibrillogenesis and is thought to be important for
regulation of fibril diameters. 4,16,17



■ Purification and recombinant synthesis 

To purify fibrillar collagens from tissues, pepsin has been
commonly used to dissociate collagen triple-helical
domains (these are pepsin resistant) from other extracellular
matrix molecules. Repeated differential salt precipitations
at neutral pH as well as in acid conditions are
then used to purify each fibrillar collagen type. 18 Types II,
V, and XI collagen require higher salt concentrations than
types III and I to precipitate. 19 Fibrillar procollagen molecules
have been purified from media of cultured cells. 20,21
Addition of protease inhibitors and avoidance of acidic
pH prevent the action of endogenous proteolytic
enzymes that remove the propeptides, resulting in the
isolation of intact precursor molecules. Purified collagens
are available from a number of commercial sources. A
good listing of many suppliers is provided in the
BioSupplyNet Source Book, available at http://www.biosupplynet.com.
Recently, recombinant fibrillar procollagens
have been synthesized in mammalian cells as well
as in insect cells transfected with genes encoding the
α and β subunits of prolyl 4-hydroxylase to ensure
proper hydroxylation of prolyl residues. 22-25 A fragment
of human type III collagen has been produced in
Saccharomyces cerevisiae. 72

Figure 1. Diagram of the domains of a fibrillar procollagen molecule (type I). (From Olsen 1991. 13)

■ Antibodies

Polyclonal and monoclonal antibodies against all fibrillar
collagens from a number of animal species are available.
Some of these antibodies are directed against epitopes
in the propeptide domains of the procollagens; 20
others are directed against epitopes in the triple-helical
domain. 26,27 Several monoclonal antibodies have been
used for epitope mapping in conjunction with rotary
shadowing and electron microscopy. 27,28 A variety of
antibodies are available from commercial sources; a
good listing of suppliers can be found in the
BioSupplyNet Source Book.



■ Activities

The triple-helical products of procollagen processing, 
fibrillar collagens, polymerize to form fibrils that serve as 
stabilizing scaffolds in extracellular matrices. 29 Within the 
fibrils, the 300 nm long rod-like molecules overlap with 
their ends by about 30 nm and are arranged in quarter- 
staggered arrays. The fibrils therefore have a periodic 
structure. Each period is 67 nm long and consists of a 
'hole' zone with more loosely packed molecules and an 
overlap zone with more densely packed molecules. These 
zones can easily be visualized by negative staining and 
electron microscopy. When fibrils are positively stained, a 
periodic cross-striation pattern is observed, reflecting the 
distribution of clusters of charged amino acid residues 
along the collagen molecules. 29 Cell differentiation and 
migration during development are influenced by fibrillar 
collagens, and collagens interact with cells through inte- 
grin receptors on cell surfaces. 
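
The dimensions quoted above can be checked with a little arithmetic. The sketch below is a minimal illustration and not part of the original text. It assumes the simple quarter-stagger picture in which each molecule is displaced axially from its neighbour by exactly one period, so that a molecule spans a whole number of periods plus a remainder that forms the overlap zone. The function name and dictionary keys are hypothetical.

```python
# Illustrative arithmetic only. The 300 nm and 67 nm values come from the text;
# the one-period axial offset between neighbouring molecules is an assumption
# of the simple quarter-stagger picture.

MOLECULE_LENGTH_NM = 300.0  # length of one rod-like collagen molecule
PERIOD_NM = 67.0            # length of one fibril period


def stagger_geometry(length_nm: float, period_nm: float) -> dict:
    """Split one period into overlap and 'hole' zones for staggered rods."""
    whole_periods = int(length_nm // period_nm)          # full periods spanned
    overlap_nm = length_nm - whole_periods * period_nm   # remainder = overlap zone
    hole_nm = period_nm - overlap_nm                     # rest of the period
    return {
        "whole_periods": whole_periods,
        "overlap_nm": overlap_nm,
        "hole_nm": hole_nm,
    }


geometry = stagger_geometry(MOLECULE_LENGTH_NM, PERIOD_NM)
print(geometry)
```

With these inputs the overlap comes out at 32 nm, in line with the "about 30 nm" quoted above, leaving a hole zone of roughly 35 nm in each 67 nm period.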

Collagen fibrils usually contain more than one type of 
collagen, 30 and such heterotypic fibrils are arranged in 
different patterns in different tissues; parallel fibril 
bundles in tendon, criss-crossing layers in cornea, and 
spiral arrangements in lamellar bone (Fig. 2). Hetero- 
typic fibrils containing types I, III, and/or V collagens are 
expressed in a number of tissues of mesenchymal origin 
such as skin, tendon, ligaments, and bone, whereas 
fibrils with types II and XI are found predominantly in 
hyaline cartilage and the vitreous body of the eye. It 
is believed that the presence of small amounts of 
collagens V and XI within the fibrils limits fibril diame- 
ters due to steric hindrance, based on the incomplete 
removal of N-propeptides from types V and XI molecules. 17,30,31 Thus, the diameter of heterotypic collagen
fibrils depends on the ratio between collagens V and I (or
XI and II); the higher the ratio, the thinner the fibrils
(Plate 11). Fibril properties are also dependent on interactions with FACIT collagens and small proteoglycans
(decorin, biglycan, fibromodulin). 4,32-39 The ability of the
C propeptide to serve as a ligand for the integrin α2β1
may play a role in regulating fibrillogenesis at the cell
surface. 40,41 Binding of fibrillar collagen to fibronectin
may play a role in assembly of fibronectin fibres. 42

Figure 2. Collagen fibrils in 14-day chick embryo tendon (A), sternal cartilage (B), dermis (C), and corneal stroma
(D). (Courtesy of Dr David Birk.)



■ Genes 

The chicken α2(I) collagen gene was the first fibrillar collagen gene to be isolated and completely characterized. 43,44 Since then, cDNAs and genomic clones have
been isolated for almost all fibrillar collagen genes, from
a number of species. The number and the size of exons
are similar in the various fibrillar collagen genes. 2,3,45 It is
likely that many of the triple-helical domain exons 
evolved by repeated duplications of an exon unit of 54 
base pairs. Alternative splicing generates transcripts 
encoding fibrillar procollagens with different N-propep- 
tide domains in the COL2A1, COL11A1, and COL11A2 
genes. 46-52 Since the number of entries in the GenBank/ 
EMBL data bank is very large, readers are referred to 



original publications for accession numbers of specific 
sequences. 
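
The 54 base pair exon unit mentioned above maps neatly onto the repeating structure of the triple helix. As a hedged illustration (not from the original text), the sketch below assumes three nucleotides per codon and a Gly-X-Y repeat of three residues, and checks that an exon of a given length encodes a whole number of such triplets. The function name is hypothetical.

```python
# Illustrative arithmetic only. The 54 bp unit comes from the text; the
# three-residue Gly-X-Y repeat is the assumed repeat unit of the triple helix.

CODON_BP = 3           # nucleotides per amino acid residue
TRIPLET_RESIDUES = 3   # residues per Gly-X-Y repeat


def exon_to_triplets(exon_bp: int) -> tuple:
    """Return (residues, whole Gly-X-Y triplets, leftover residues)."""
    if exon_bp % CODON_BP != 0:
        raise ValueError("exon length is not a whole number of codons")
    residues = exon_bp // CODON_BP
    triplets, leftover = divmod(residues, TRIPLET_RESIDUES)
    return residues, triplets, leftover


for size in (54, 108):  # the exon unit and one duplication of it
    print(size, exon_to_triplets(size))
```

A 54 bp unit therefore encodes 18 residues, or six complete Gly-X-Y triplets, and a duplicated 108 bp exon encodes twelve, so duplication of the unit keeps the triplet repeat in register.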

■ Mutant phenotypes/disease states 

Mutations in mice 

Homozygous Mov13 mice, carrying an insertion of proviral sequences in a transcriptional enhancer within the first
intron of Col1a1, are developmentally arrested between
days 11 and 12 of gestation due to a block of transcription
of the gene in fibroblasts. 53 Heterozygous Mov13 animals
survive to adulthood and serve as models for the mild
dominant form of osteogenesis imperfecta. 54 A frame-shift
mutation in the C-propeptide coding domain of the Col1a2
gene in the oim mouse also results in an osteogenesis
imperfecta-like phenotype. 55 Disproportionate micromelia
(Dmm) in mice is caused by a three-nucleotide deletion in
the C-propeptide coding region of Col2a1, 56 while autosomal recessive chondrodysplasia (cho) is caused by a frame-shift mutation in Col11a1 leading to loss of synthesis of
α1(XI) collagen chains. 57

Transgenic mice expressing dominant-negative mutant
constructs have been generated for Col1a1, Col2a1, and
Col5a2. 58-63 Mice overexpressing a wild-type gene
or carrying inactivated ('knock out') alleles have also
been described for several fibrillar collagen genes. 64-66
■ Human diseases

Mutations in COL1A1 and COL1A2, the genes encoding 
the α chains of type I procollagen, account for the majority of cases of osteogenesis imperfecta and for certain
types of the Ehlers-Danlos syndrome. 14,67,68 Mutations in
COL3A1 cause Ehlers-Danlos syndrome type III and type IV. 
Mutations in COL5A1 have been described in patients with 
Ehlers-Danlos syndrome type I and type II. 69 Mutations in 
COL2A1 cause a spectrum of chondrodysplasias, including 
achondrogenesis II, hypochondrogenesis, spondyloepiphy- 
seal dysplasia, and Kniest and Stickler syndromes. 14 



■ Websites 

A mutation database for the COL1A1, COL1A2, and
COL3A1 genes has been described 70 and is accessible at
http://www.le.ac.uk/genetics/collagen/collagen.html. A
review of almost 300 mutations in COL1A1, COL2A1, and
COL3A1, as well as other collagen genes, has been published. 71 The OMIM database (www3.ncbi.nlm.nih.gov/Omim/)
is an excellent resource for all fibrillar collagens and their associated inherited diseases.
FACIT collagens are a group of proteins that may serve as
molecular bridges between fibrillar collagens and other
extracellular matrix components. Their structure is strikingly different from that of other collagens in that their
molecules contain two, three, or more relatively short
triple-helical domains connected by non-triple-helical
sequences. For some FACIT collagens, utilization of
alternative promoters and alternative splicing gives rise
to different transcripts that are expressed in tissue-specific and time-dependent manners during embryonic
development.

The FACIT (fibril associated collagens with interrupted
triple helices) group of collagens includes at least five
types of molecules, IX, XII, XIV, XVI, and XIX, composed
of seven distinct polypeptide chains. 1,2 cDNA clones
encoding an additional chain are currently being characterized (M. Gordon; personal communication) so it is
likely that the group will prove to contain even more
types of collagens. The domain structure of the first molecule of this group to be described, type IX collagen, was
predicted by cloning and sequencing of a cDNA encoding
the chicken α1(IX) chain. 3 Type IX molecules are heterotrimers consisting of α1(IX), α2(IX), and α3(IX)
chains. 4,5 As shown in Fig. 1, they contain three triple-helical (COL) domains interrupted by non-triple-helical
(NC) regions. Most of the NC domains contain cysteinyl
residues forming disulphide bridges between subunits.
One of the subunits, α2(IX), serves as a proteoglycan core
protein and contains a glycosaminoglycan side chain
attached to a seryl residue in the NC3 domain. 6-10 Before
the structure of collagen IX was established by cDNA
cloning/sequencing, it was, in fact, isolated as a proteoglycan called PG-Lt from chicken cartilage. 11,12 In cartilage, the glycosylation of the α2(IX) chain is incomplete
and the glycosaminoglycan side chain is relatively
short. 13,14 In the chicken vitreous body of the eye,
however, the side chain is much longer and here type IX
collagen may function primarily as a proteoglycan core
protein. 15,16

Figure 1. Diagram of type II collagen-containing fibrils with type IX collagen molecules on the surface. (From
Jacenko et al. 1991. 83 )

Type IX collagen molecules, expressed in hyaline carti- 
lage, are associated with the surface of collagen fibrils 
such that two of the triple-helical domains are located at 
or close to the fibril surface, while an N-terminal globular 
domain is located in the perifibrillar space at the tip of a 
triple-helical arm (Fig. 2). 17 The type IX collagen mole- 
cules have an antiparallel orientation relative to the col- 
lagen II molecules within the fibrils. This has been 
deduced from the positions of covalent hydroxypyridinium cross-links between the two types of molecules. 18,19 There are also covalent cross-links between
different type IX collagen molecules, suggesting that type 
IX molecules on the surface of one fibril may be cross- 
linked to molecules on the surface of another fibril at 
points of intersection. 19,20 Immunofluorescence, in situ
hybridization, and biochemical studies have shown that 
this FACIT collagen is also present in embryonic chick 
cornea and in the vitreous body. 21 Different tissues 
contain different forms of type IX molecules. These forms 
are translation products of two distinct mRNAs generated 
by alternative transcription of the α1(IX) collagen gene. 22
In chondrocytes, the majority of the α1(IX) transcripts are
synthesized from an upstream transcription start site
leading to the formation of mRNA that codes for a
polypeptide with an N-terminal, globular domain. 23-25 In
embryonic chick cornea, 22,26,27 the vitreous body, 15,16,27
neural retina, 28,29 perinotochordal matrix, 30,31 and early
limb buds, 32 the majority of the transcripts are synthesized from a downstream, alternative start site, leading
to the formation of mRNA encoding an α1(IX) chain with an
alternative signal peptide sequence and lacking the N-terminal globular domain.

Types XII and XIV collagen are homologous, but dis- 
tinct homotrimeric molecules. 33-41 Their subunits contain 
two triple-helical (COL) domains separated by an NC 
region of more than 40 amino acid residues, a relatively 
short (<100 amino acid residues) non-triple-helical C- 
terminal region, and a very large (>1500 amino acid 
residues) N-terminal non-triple-helical domain 2 (Fig. 3). 
Within the native molecules the COL domains of the 
three subunits form a triple-helical tail attached to a 
central globule from which three non-triple-helical arms 
or finger-like structures project. 40-42 The central globule
and the arms are composed of the N-terminal NC 
domains. For both types XII and XIV collagen, alternative 
splicing of primary transcripts generates molecular diver- 
sity. Two major molecular forms of type XII collagen 
differ in the lengths of the N-terminal NC (NC3) 
domains. 2 In form XIIA, the NC3 domains contain 18 
fibronectin type III repeats and four von Willebrand 
factor A-like domains. 34 In the shorter form XIIB there are 
10 fibronectin type III repeats and two von Willebrand 
factor A-like domains 35 (Fig. 3). Both forms contain identi- 
cal signal peptides and are encoded by mRNAs with iden- 
tical 5' untranslated sequences. For form XIIB, two 
variants generated by alternative splicing at the 3' end of 
the primary transcript have been described. 43 In variant 
XIIB1 the carboxyl non-triple-helical domain NC1 is 74
amino acid residues long and contains an acidic region
followed by a basic region. In variant XIIB2 the NC1
domain is much shorter (only 19 residues) and lacks these 
features. 43 

In type XIV collagen the NC3 domain is somewhat 
smaller than in collagen XIIB. Splice variations affecting 
the 5' untranslated region of type XIV collagen mRNA, 
the N-terminal fibronectin type III repeat as well as the 3' 
region have been described. 3 *- 39 * 44 Undulin, a protein iso- 
lated from human placenta, has been shown to be 
encoded by the type XIV collagen gene and represents 
one of these variants. 44,45 The initially isolated undulin
was a protein composed of only von Willebrand factor A- 
like domains and fibronectin type III repeats and no colla- 
gen sequences, but this was probably caused by 
proteolysis during isolation of the protein. 46 

Types XII and XIV collagen are found in most dense 
connective tissues. There is considerable overlap between 
their tissue distributions, but there are also some differences. 47-49 In bovine skin, type XII collagen is particularly
concentrated in the papillary dermis, while type XIV colla- 
gen is present in the reticular dermis. 40 In periosteum, 
type XIV collagen appears restricted to the outer fibrous 
layer while type XII collagen is expressed both in this 
layer and in the innermost layer of osteogenic cells. 50 
Antibodies to both collagens show labelling along type I- 
containing fibrils. 51 

The structure of types XVI and XIX collagens has 
been deduced from cDNA sequences. Both molecules 
contain multiple triple-helical and non-triple-helical 
domains. Type XVI collagen contains 10 triple-helical 
domains interspersed with 11 short non-triple-helical sequences 52,53 (Fig. 3). The C-terminal triple-helical
domain (COL1) shows structural homology with the COL1 
domains of types IX, XII, and XIV collagens. The non- 
triple-helical domains contain multiple cysteine residues 
that are arranged in a pattern similar to that found in 
cuticle collagens of C. elegans. 54 Type XVI collagen cDNAs
were initially isolated from human fibroblast 52 and pla- 
cental cDNA libraries, 53 but subsequent studies have 
shown a wide range of expressing tissues. 55 Type XIX col- 
lagen was originally discovered through cDNA cloning 
with RNA from the human rhabdomyosarcoma cell line 
RD (CCL136). 56 The predicted polypeptide was found to 
contain five triple-helical (COL) domains, interspersed 
with and flanked by six non-triple-helical (NC) domains
(Fig. 3). The coding region is relatively small in comparison to the size of the transcript due to a long 3'-UTR
(5 kb). The α1(XIX) gene is located on human chromosome 6q12-q13, syntenic to the α1(IX) and α1(XII)
genes. 57 In mouse embryos the α1(XIX) gene is transcribed in many organs but only a few adult tissues such
as brain, eye and testis appear to express this collagen. 56 



Figure 2. Rotary shadowing micrograph of type II 
collagen fibril with type IX molecules on the surface. 
(From Vaughan et al. 1988. 17 )
Figure 3. Diagram showing the domain structures of
the members of the FACIT group of proteins. The 
domains are counted from the C terminus. 
Non-triple-helical domains are shown as open 
rectangles; triple-helical domains as solid lines. 

■ Purification and recombinant synthesis 

Type IX collagen can be purified from the medium of chon- 
drocyte cultures or from cartilage tissue extracts. 59 Triple- 
helical fragments of the molecule have been purified from 
pepsin extracts of cartilage. 60 Types XII and XIV collagens 
have been purified from neutral salt extracts of skin and 
tendons 40 and as triple-helical fragments by pepsin extrac- 
tion. 36 Type XVI collagen has not been purified as a protein 
from tissues, but a 160-210 kDa protein was detected by 
polyclonal antibodies in Western blots, consistent with the 
predicted structure from cDNA. 53,55 A recombinant α1(XIX)
peptide was produced in E. coli. 58

■ Antibodies 

A large number of antibodies against FACIT collagens are 
available. Polyclonal antibodies against synthetic peptides deduced from nucleotide sequences 7 and polyclonal 20 as well as monoclonal antibodies 61-63 against
protein fragments have been described for type IX colla- 
gen. A monoclonal antibody against a synthetic peptide 
derived from cDNA sequences recognizes the chicken 
α1(XII) chain by Western blotting, and has been used for
immunohistochemical studies. 47 Monoclonal antibodies
against bovine type XII (TL-A) and type XIV (TL-B) collagen are also available. 40,51 Polyclonal antibodies have
been made against a synthetic peptide and a recombi- 
nant fragment of type XVI collagen. 55 

A polyclonal antibody against a recombinant α1(XIX)
peptide was raised and used for Western blotting. 58 

■ Activities

Type IX collagen molecules are arranged in a periodic 
fashion along heterotypic collagen II/XI fibrils. 17 Covalent
lysine-derived hydroxypyridinium cross-links between
types IX and II molecules, as well as between collagen IX
molecules, stabilize the fibril association. 18,19 This
arrangement suggests that type IX collagen may serve as 
a molecular bridge between fibrils as well as between 
fibrils and other extracellular matrix constituents. Based 
on the colocalization of types XII and XIV with collagen I-
containing fibrils in tissues, and the partial sequence simi- 
larity between types IX, XII, and XIV collagen, it is 
thought that types XII and XIV collagen associate with 
type I collagen fibrils in a similar fashion to type IX with 
type II fibrils. 64 Although type XIV collagen molecules
did not bind to type I collagen in experiments with iso- 
lated matrix molecules, 65 it has been shown that type XII 
collagen can become incorporated into type I collagen 
fibrils when it is present during fibril formation; removal 
of the triple-helical domains of type XII reduced its ability 
to polymerize with type I collagen. 66 Both types XII and 
XIV collagen can bind to the dermatan sulphate chains of 
the fibril-associated proteoglycan decorin. 65,67 The two
FACIT molecules may therefore interact with fibrils both 
directly and indirectly and may be important in keeping 
fibrils together in bundles, or alternatively, in preventing 
fibril fusion during tissue morphogenesis. Addition of 
types XII and XIV collagen to gels of type I collagen pro- 
moted fibroblast-induced gel contraction. 68 The effect 
was lost upon denaturation of the proteins, but was not 
reduced when the triple-helical domains were digested 
with bacterial collagenase. Since type XIIA carries glycosaminoglycan side chains while XIIB does not, 41 it is
possible that cells can regulate the hydrophilic properties 
of perifibrillar compartments by controlling the expres- 
sion of the two major splice variants of type XII collagen. 

■ Genes

cDNA sequences are available for chicken α1(IX), α2(IX),
and α3(IX), 3,69-71 mouse α1(IX) and α2(IX), 72,73 and human
α1(IX), α2(IX) and α3(IX). 25,74,75 Partial cDNA sequences
from rat, 76 bovine, 69 and dog (GenBank L77390) α1(IX)
are also reported. Genomic clones and sequences are
available for the chicken α1(IX) and α2(IX) genes 23,69,77 as
well as the mouse and human α1(IX) and α2(IX)
genes. 9,24,73,75,78 Genomic and cDNA sequences are also
available for α1(XII), α1(XIV), α1(XVI) and α1(XIX) collagens from several species. 33,34,38,39,52,53,56-58

■ Mutant phenotype/disease states

Evidence for a role of type IX collagen in maintaining 
long-term stability of cartilage comes from studies of 
transgenic mice and genetic abnormalities in humans. In 
transgenic mice expressing an α1(IX) transgene with an
in-frame deletion in the central triple-helical domain
(COL2) 80 and in homozygous mice with inactivated α1(IX)
alleles, 78 articular cartilage developed degenerative 
changes resembling those of human osteoarthritis. In 
humans, demonstration of linkage between the COL9A2 
locus and multiple epiphyseal dysplasia 2 (EDM2) (OMIM 
600204) 81 was followed by confirmation of linkage in a
second family with EDM2 and identification of a splice 
site mutation in COL9A2 causing exon skipping and dele- 
tion of 12 amino acid residues in the COL3 domain of the 
α2(IX) collagen chain. 82 Affected individuals in the two
families develop stiffness and pain in knees during child- 
hood and adolescence. X-rays of knees reveal flattened, 
irregular epiphyses and gradually appearing osteoarthri- 
tis. Because of the heterotrimeric structure of type IX col- 
lagen molecules, one would expect mutations in COL9A1 
and COL9A3 to also cause multiple epiphyseal dysplasia 
with early onset osteoarthritis. 
Short chain collagens



Types VIII and X collagen, composed of the three chains 
α1(VIII), α2(VIII), and α1(X), form the subgroup of short
chain collagens, so named because their subunits are
short (only about 60 kDa) as compared with fibrillar collagen chains. Despite similarities in domain structure,
amino acid sequences, and genomic exon configurations, 
the two types show very different temporal and spatial 
expression. Given the similarity in exon structure it is 
likely that the three genes evolved by duplication of a 
common precursor gene. 



Type VIII collagen was originally identified as a product 
of bovine aortic and rabbit corneal endothelial cells, but 
is also synthesized by non-vascular cells. 1 The molecule is 
probably a heterotrimer composed of α1(VIII) and α2(VIII)
chains in a ratio of two to one, 2 but the existence of
homotrimeric molecules composed entirely of α1(VIII) or
α2(VIII) chains cannot be ruled out. 3

Type X collagen is a specific product of hypertrophic 
chondrocytes and is a useful marker for chondrocyte maturation to hypertrophy. 4 Except for the avian eggshell, 5 it
does not appear to be expressed in other tissues outside
hypertrophic cartilage. The molecule is a homotrimer of
α1(X) chains and has a domain structure that is similar to
that of type VIII collagen: a central triple-helical (COL1)
domain of 50 kDa is flanked by N-terminal (NC2) and C-terminal (NC1) non-triple-helical domains. 6 Both type VIII
and type X molecules appear as 130 nm long rods with
knobs at both ends by electron microscopy after rotary
shadowing. 5,7 The COL1 and NC1 domains of both types
are encoded by one large exon, whereas the NC2 domain
is encoded by a small additional exon. 7-9 Additional exons
(one for α1(X) or two for α1(VIII)) encode the 5' untranslated portion of the mRNA. This exon configuration is in
stark contrast to the multiexon structure of most other 
collagen genes. 

Despite the similarities, a distinct tissue distribution has 
been found for the two short chain collagens: type X is 
restricted to hypertrophic cartilage, whereas type VIII is
distributed in various tissues including Descemet's mem- 
brane, vascular subendothelial matrices, heart, liver, 
kidney, perichondrium, and lung, as well as several malig- 
nant tumours including astrocytoma, Ewing's sarcoma, 
and hepatocellular carcinoma. 1,10,11

In Descemet's membrane, type VIII collagen molecules 
represent major components of a hexagonal lattice struc- 
ture, 12 with type VIII molecules most probably linked by 
interactions involving the non-triple-helical end regions 
(Fig. 1). Type X collagen molecules may form the same 
kind of polymer in hypertrophic cartilage, 13,14 and colocalization with a proteoglycan epitope 15 suggests a
complex with proteoglycans. The expression of type X 
collagen is regulated primarily at a transcriptional
level. 16,17 Evidence from 'knockout' and overexpression
studies suggests that chondrocyte hypertrophy and type
X collagen expression are negatively regulated by PTHrP
and its receptor in growth plates. 18

Figure 1. Hexagonal network structures formed in
cultures of bovine corneal endothelial cells. The backbone of the network is composed of type VIII collagen
molecules. (From Sawada, H. et al. 1984. 46 )

■ Purification and recombinant synthesis

The triple-helical domain of type VIII collagen can be 
purified from Descemet's membrane by pepsin extrac- 
tion. 1 The digested material can be precipitated with 
NaCl at neutral pH, and purified further by chromatography through agarose and by reverse phase HPLC. 1 Intact
type VIII collagen can be recovered from the medium of 
cultured endothelial cells. 1 Type X collagen can be iso- 
lated intact from the medium of chicken hypertrophic 
chondrocytes kept in long-term culture or as a triple-helical fragment by pepsin extraction of hypertrophic cartilage. 4,19 Site-directed mutagenesis of human type X
collagen has been used to determine the role of each 
domain in molecular assembly and secretion. 20 Three 
mutants with mutations in the C-terminal NC1 domain 
that are similar to those found in patients with Schmid 
type metaphyseal chondrodysplasia (see below), were 
unable to assemble into homotrimers in vitro or in vivo 
and were not secreted from cells. 20 In-frame deletions 
within the triple-helical domain did not prevent molecu- 
lar assembly and secretion of pepsin-resistant triple- 
helical molecules. 20 

■ Antibodies

Polyclonal antibodies against bovine type VIII collagen 
have been used for expression studies and immunoblots. 11,21,22 Monoclonal antibodies against the bovine
α1(VIII) chain have been used for immunoelectron
microscopy to demonstrate that the backbone within 
Descemet's membrane is composed of type VIII colla- 
gen. 12 These antibodies are commercially available. 
Polyclonal antibodies against sheep type VIII collagen 
have also been produced. 10 A conformation-dependent 
monoclonal antibody, X-AC9, against chicken type X col- 
lagen has been used extensively for investigations of the 
tissue distribution, ultrastructure, and thermal stability. 4
Several other antibodies against chicken, mouse, bovine,
and human type X collagens are available. 15,23-25

■ Activities

Type VIII collagen is the major constituent of the hexago- 
nal lattice observed in Descemet's membrane, as demon- 
strated by immunoelectron microscopy. 12 It is possible 
that the general function of type VIII collagen is to 
provide an open, porous structure that can withstand 
compressive force. Type X collagen in hypertrophic cartilage may play a similar role by providing a scaffold to
prevent local collapse as the hypertrophic cartilage matrix
is removed during endochondral ossification. 13,26,27
Several observations suggest a positive and/or negative
role for type X collagen in the calcification of hypertrophic cartilage. 4

■ Genes

The primary structures of the α1(VIII) and α2(VIII) chains
are strikingly similar to that of α1(X) collagen. cDNA and
genomic DNAs encoding rabbit, human, and mouse
α1(VIII) and α2(VIII) chains have been isolated and characterized. 3,6,7,28 The human α1(VIII) and α2(VIII) genes are
located on chromosomes 3 and 1, respectively. 3,28 The
chicken type X gene was the first to be isolated among
the short chain collagen genes. 8,28 The bovine, 29
mouse, 30,31 and human 32-34 type X genes were subsequently sequenced and characterized. The COL10A1 gene
has been localized to human chromosome 6q21-q22; 35
the mouse gene is on chromosome 10. 33

■ Mutant phenotype/disease states 

Mice carrying a transgene encoding an α1(X) collagen
chain with an in-frame deletion in the triple-helical 
domain developed skeletal abnormalities within 2-3 
weeks after birth. 36 Histology showed a decrease in the 
width of the zone of hypertrophic chondrocytes in 
growth plates, decreased bone formation in the meta- 
physes of long bones, and bone marrow abnormalities. 36 
Craniofacial abnormalities were also noted. 37 Although 
an initial study of type X collagen null mice reported no 
phenotypic abnormalities, 38 subsequent studies 24 of mice 
that are homozygous for inactivated Col10a1 alleles show
distinct growth plate abnormalities. Recent analyses (O. 
Jacenko, personal communication) indicate that these are 
similar to those seen in the type X collagen transgenics. 
These findings in mice are consistent with the demonstra- 
tion that Schmid metaphyseal chondrodysplasia (OMIM 
156500), an autosomal dominant disorder in humans with
short stature and growth plate abnormalities, is caused 
by mutations in the COL10A1 gene. Since the first discov- 
ery of a frame-shift-causing deletion in the C-terminal 
NC1 domain, 39 a large number of mutations in the NC1 
domain of type X collagen have been found in patients 
with Schmid metaphyseal chondrodysplasia. 40-44 Except 
for one report describing mis-sense mutations in the N- 
terminal NC2 domain, 45 all mutations have been in the 
NC1 domain. Studies on the molecular assembly and
secretion of mutant polypeptides 20 support the initial 
hypothesis 39 that Schmid metaphyseal chondrodysplasia 
is caused by haploinsufficiency for type X collagen. 
Basement membrane collagens



Type IV collagen is the major collagenous component of 
basement membranes, forming a network structure with 
which other basement membrane components (laminin, 
nidogen, heparan sulphate proteoglycan) interact. Six dis- 
tinct genes are identified as belonging to the type IV col- 
lagen gene family. They form three pairs of genes on 
three different chromosomes; within each pair the genes 
are arranged head-to-head and regulated by a bidirec- 
tional promoter. 

Collagen molecules, composed of two α1(IV) and one
α2(IV) chain, have long been recognized as a major component of basement membranes. 1-3 Each of the two
chains is about 1700 amino acid residues long and con- 
tains at least three distinct domains: an N-terminal cys- 
teine-rich (7S) domain, a central triple-helical domain, 
and a C-terminal non-triple-helical domain (NC1). 1 Type
IV molecules assemble into a network which is quite 
different from the banded fibrils formed by fibrillar 
collagen types (Fig. 1). Within the network, separate 
molecules are covalently cross-linked within laterally asso- 
ciated 7S domains and associated by end-to-end interac- 
tions through their NC1 domains. 1,2,4 Lateral associations 
between the triple-helical domains also contribute to the 
network structure 1,2 (Fig. 1). 

While type IV collagen molecules, composed of α1(IV)
and α2(IV) chains, are broadly expressed, molecules containing combinations of four additional chains,
α3(IV)-α6(IV), are important components of specialized
basement membranes. 4-9 In the kidney glomerular basement membrane, molecules of α1(IV) and α2(IV) are replaced
by α3(IV), α4(IV), and α5(IV) chains as development proceeds. 10 The α6(IV) chain is present in epidermal basement membranes, around smooth muscle cells and
adipocytes, and in Bowman's capsule and renal distal
tubules, but absent from glomerular basement membranes. 11 The precise chain composition of triple-helical
molecules assembled from the α3(IV)-α6(IV) chains is not
entirely clear, but it is believed that α3(IV) and α4(IV)
chains form heterotrimeric molecules. Also, analyses of
bovine seminiferous tubule basement membranes have
established a structural linkage between α3(IV) and
α5(IV) chains. 12 α3(IV)/α4(IV) molecules and molecules
containing α5(IV) chains may therefore be components of
the same network. This helps to explain the observation
that glomerular basement membranes from patients with
Alport syndrome caused by mutations in α5(IV) (see
below) are defective in α3(IV) and α4(IV). 13,14

■ Purification and recombinant synthesis 

Fragments of type IV collagen can be extracted from 
basement membranes with pepsin (resulting in triple- 
helical fragments) or with bacterial collagenase (resulting 
in non-triple-helical domains). 1,15 Intact type IV collagen 
composed of α1(IV) and α2(IV) chains can be isolated by
acetic acid extraction of murine EHS-tumour tissue, 1 and
is commercially available. Pepsinized material is also commercially available. The BioSupplyNet Source Book contains a good listing of suppliers. The NC1 domains of the
human α1(IV)-α5(IV) chains have been synthesized in
E. coli 16 and in insect cells. 17
Figure 1. Electron micrographs of type IV collagen monomer, dimer, tetramer, and supramolecular aggregate after
rotary shadowing, and schematic illustrations of the structure and supramolecular assembly of type IV collagen.
(a) Each α chain contributes to the 400 nm long triple-helical (COL) domain. This contains a number of
interruptions in the Gly-X-Y repeat sequence (not shown). A globular non-triple-helical domain (NC1) is located at
the C-terminal end. The 7S domain is at the N-terminal end. Three α chains form a triple-helical molecule. The
triple-helical molecules are the building blocks (monomers) of the basement membrane meshwork. Monomers
associate into dimers that are stabilized by disulphide bonds between NC1 domains (b) or tetramers that are
stabilized by disulphide bonds between the N termini (c). The supramolecular network is formed by assembly of
dimers and tetramers, and strengthened by lateral associations between molecules (d). Other basement membrane
components such as laminin, proteoglycans, and nidogen are incorporated into the type IV collagen meshwork.
(Courtesy of Dr Eijiro Adachi, School of Medicine, Kitazato University.)



■ Antibodies

A variety of antibodies are available. 1,10,18,19 These include
antibodies against the 7S and NC1 domains, as well as 
antibodies against pepsin fragments. 1 Both polyclonal 
and monoclonal antibodies against type IV collagen are 
commercially available from several sources. FITC-anti-
a5(IV) and Texas red-anti-a2(IV) antibodies for diagnosis
of Alport syndrome are commercially available; a good 
listing of suppliers can be found in the BioSupplyNet 
Source Book. 



■ Activities 

Type IV collagen can interact with cells indirectly through 
laminin. Strong binding of type IV collagen to laminin is 
mediated by nidogen/entactin, 20,21 a glycoprotein of
about 150 kDa which binds tightly to laminin 22,23 and also has
binding sites for type IV collagen and cells. 3 In addition,
direct low affinity interaction between laminin and
type IV collagen is possible. 2,24 Type IV collagen also binds
to heparin and heparan sulphate proteoglycan, 2,25-27 and
heparin can inhibit type IV collagen polymerization. 25

Many cell types adhere to type IV collagen, 1,28 and peptides
from within type IV sequences can inhibit this adhesion. 29
A major cell binding site in a1(IV)/a2(IV)
heterotrimers is localized about 100 nm from the N terminus
of the molecule; this triple-helical binding site
interacts with α1β1 and α2β1 integrins on cells. 4
Recombinant fibulin-2 has a weak affinity for type IV col- 
lagen, but binding of nidogen to immobilized fibulin-2 
allowed the formation of ternary complexes with colla- 
gen IV. 30 

■ Genes 

Six distinct type IV collagen genes have been identified. 
These are organized in three sets, COL4A1/COL4A2,
COL4A3/COL4A4, and COL4A5/COL4A6, which in humans
are localized on three different chromosomes, 13, 2, and 
X, respectively (Fig. 2). 4 Within each set the genes are 
arranged head-to-head and their expression is regulated 
by bidirectional promoters between the genes. The 5' 
ends of the genes overlap; the transcription start sites are 
separated only by 130 bp in human 31,32 and mouse 33,34
a1(IV) and a2(IV) genes. The transcriptional regulation of
COL4A1/COL4A2 is well characterized. 35,36 Transcription
of COL4A6 seems to be controlled by two alternative
promoters. 37 The complete primary structures of mouse and
human a1(IV) and a2(IV) chains have been deduced from
cDNA sequences, and mouse and human genomic clones
have been extensively characterized. 38-41 The human
a3(IV) and a4(IV) genes, located on chromosome 2, have
also been well characterized. 42-44 The primary structures of
the human a5(IV) and a6(IV) chains have been established
by sequencing of cDNAs 8,9 and genomic clones. 37,45,46

Type IV collagen genes have been characterized in 
several invertebrates, such as Drosophila, 47 Caenorhabditis
elegans, 48-51 Ascaris suum, 52 and sea urchin,
Strongylocentrotus purpuratus. 53,54 The protein encoded
by the a1(IV) collagen gene in Drosophila is quite similar
to the vertebrate type IV collagen chains, but the gene
has fewer exons and is smaller than the corresponding
vertebrate gene. 47 In C. elegans the clb-1 and clb-2 genes
are homologous to the vertebrate a1(IV) and a2(IV) collagen
genes; 48 however, these genes are located on separate
chromosomes. Interestingly, mutations in the a1(IV)
gene in C. elegans result in temperature-sensitive
lethality during late embryogenesis. 49

■ Mutant phenotype/disease states

Homozygous Col4a3-null mice show a phenotype that is
similar to Alport syndrome in humans. 55,56 Decreased
glomerular filtration leads to uraemia, changes in the
glomerular basement membrane cause proteinuria, and
glomerulonephritis develops. Histological and molecular
analyses indicate that the absence of a3(IV) chains causes
loss of a4(IV) and a5(IV) chains from the glomerular basement
membrane, and leads to increased levels of type VI
collagen and perlecan, as well as retention of a1(IV) and
a2(IV) chains. 55,56 Canine X-linked hereditary nephritis is
an animal model of human X-linked Alport syndrome, 
and has been shown to be caused by a premature stop 
codon in the a5(IV) collagen chain. 57 

Figure 2. Illustration of the organization and chromosomal locations of the human type IV collagen genes. The
genes coding for the six type IV collagen chains are located in pairs in a head-to-head manner on three different
chromosomes. The genes are depicted as rectangles, and the flanking regions by horizontal lines. For the COL4A6
gene, the locations of exons are indicated by vertical bars, and introns by horizontal lines. The exons are numbered
from the 5' end of the gene. Introns of unknown sizes are indicated by ellipsoids. Note that the first two exons, 1'
and 1, of the gene are alternatively utilized, and spliced to exon 2 as indicated by free lines.

Three different human diseases directly involve type IV
collagen genes or their translation products. Goodpasture
syndrome (OMIM 233450), an autoimmune disorder
causing progressive glomerulonephritis and pulmonary
haemorrhage, is caused by antibodies that bind to an
antigen (the Goodpasture antigen) in basement membranes
of kidney glomeruli and lung alveoli. 58 The
Goodpasture antigen is the NC1 domain of a3(IV) collagen
chains. 59,60 Dimers of a3(IV) NC1 domains, isolated
from bovine kidney, can induce an autoimmune response
in rabbits similar to Goodpasture syndrome. 61 Mutations
in COL4A5, located on the X chromosome, have been
demonstrated in more than 200 cases of X-linked Alport
familial nephritis (OMIM 301050). 62 In cases of autosomal
recessive Alport syndrome (OMIM 203780), mutations
have been identified in the a3(IV) and a4(IV) genes. 63 In
rare cases of diffuse leiomyomatosis associated with
Alport syndrome, large deletions involving the a5(IV) and
a6(IV) genes have been found. 64 Autosomal dominant
benign familial haematuria (OMIM 141200), characterized
by thinning of the glomerular basement membrane
and normal renal function, has been linked to the
COL4A3/COL4A4 locus and shown to be caused by a
mutation in COL4A4. 65

The non-fibrillar collagens type XV and type XVIII are
broadly expressed in many tissues, but are present at particularly
high levels in internal organs. They contain multiple
short triple-helical domains, separated and flanked
by non-triple-helical regions. Based on a considerable
degree of similarity in some of their structural domains,
they are classified as members of a novel subfamily of
collagens called multiplexins.

The two members of this class of proteins, types XV
and XVIII collagen, have been given the name multiplexins 1
because they both contain multiple triple-helix
domains with interruptions. They share considerable
homology at the amino acid level, but are sufficiently 
different to rule out the possibility that they could 
form mixed heterotrimers like types V and XI fibrillar 
collagens. 

First isolated by cross-hybridization during screening of 
a placental cDNA library, 2 type XV collagen has now been 
completely characterized at the nucleotide level. 3,4
a1(XV) collagen chains contain nine triple-helical (COL)
domains that are separated and flanked by non-triple-helical
(NC) regions (Fig. 1). The N-terminal region
(NC10*) consists of 530 amino acid residues and is almost
as large as the triple-helical region; the C-terminal
non-triple-helical region (NC1*) is somewhat smaller (256
amino acid residues).

The a1(XVIII) collagen chain contains 10 triple-helical
domains, separated and flanked by non-triple-helical
sequences 1,5 (Fig. 1). A comparison with a1(XV) shows a
striking similarity in size between the six most C-terminal
triple-helical domains of the two collagens. 1,6 Also, at the
amino acid level there is over 60 per cent identity
between the carboxyl half of the 315 residue NC1 domain
of a1(XVIII) and the corresponding portion of a1(XV). 7
Both collagens contain four cysteinyl residues in this
region and may therefore have a similar tertiary structure.
Another region of homology is a 200 residue
sequence at the amino end of the short variant (see

* For type IV basement membrane collagens, FACIT collagens, 
and short-chain collagens the numbering of triple-helical and 
non-triple-helical domains starts by counting from the C termi- 
nus of the molecule. In keeping with this tradition, this num- 
bering system has been used also for types XV 3 and XVIII 1 
collagen and is followed here. Numbering the domains from 
the N terminus has been suggested for a1(XV) and a1(XVIII) collagens,
5,7 but serves only to create confusion and should be
avoided. 

Figure 1. Diagram showing the domain structures of
types XV and XVIII collagen chains. Non-triple-helical
domains are indicated by rectangles; triple-helical
domains are indicated by a solid line. Thin vertical lines
between the two chains delineate regions of homology
in the NC10/NC11 and NC1 regions of the two chains.
Three variant transcripts give rise to three different
a1(XVIII) chains with different NC11 domains. These
are indicated as the short, intermediate, and long
forms. The frizzled-like region in the long form is
indicated by the stippled rectangle.



below) of a1(XVIII) and the corresponding region of
a1(XV). 3 This sequence is 45 per cent identical between
the two collagens and is homologous with a region of 
thrombospondin-1, the fibrillar procollagens V and XI, 
and members of the FACIT group. 7 

Northern blot analyses show that types XV and XVIII col- 
lagens are expressed in several major internal organs and 
in several cell types, including fibroblasts and endothelial 
cells. 1,3,5,8 There is considerable overlap between the
expression patterns of the two collagens, but also distinct
differences. For example, while both transcripts are found
in the kidney, a1(XV) transcripts are low in lung and liver
while those of a1(XVIII) are very high, particularly in liver.
Immunohistochemical studies demonstrate a wide distrib- 
ution of the two collagens, with a particular concentra- 
tion in basement membrane regions. 9,10 

■ Purification and recombinant synthesis 

A portion of the NC1 domain of type XV collagen has
been expressed as a recombinant protein in bacteria and 
used for generation of specific polyclonal antisera. 14 
Fragments of type XVIII collagen have been produced 
both in bacteria and in insect cells. 15 



■ Antibodies 

Antibodies have been generated against both types XV 
and XVIII collagens and used for Western blotting and 
immunohistochemical studies. 2,9,10,14

■ Activities

The supramolecular assemblies and functions of multiplexins
have not been characterized. Of considerable
interest, however, is the finding that a 20 kDa angiogenesis
inhibitor from a murine haemangioendothelioma,
called endostatin, represents a fragment of the carboxyl
region of the NC1 domain of a1(XVIII) collagen chains. 15
This portion of type XVIII collagen, produced as a recombinant
peptide and injected into mice, causes nearly complete
suppression of tumour-induced angiogenesis and
tumour growth. 15



■ Genes 

Genomic and cDNA sequences for mouse and human 
a1(XV) and a1 (XVIII) collagens are available. 1 " 6 ' 912 - 14 * 16 Of 
interest is that the a1(XVIII) collagen gene contains two
alternative promoters and that transcripts from one of
these promoters can be alternatively spliced. This gives
rise to three alternative a1(XVIII) transcripts that encode
a1(XVIII) collagen chains with very different N-terminal
(NC11) non-triple-helical domains 9,12 (Fig. 1). The shortest
variant, transcribed from the most 5' promoter, contains 
mostly the thrombospondin-1 homology region. An inter- 
mediate-sized variant, transcribed from the most 3' pro- 
moter, contains a different signal peptide and an 
additional region of about 200 residues that is rich in 
acidic amino acid residues. The longest variant, also tran- 
scribed from the most 3' promoter, contains in addition a 
250 residue region inserted between the acidic domain 
and the thrombospondin-1 homology region. This 
inserted region contains 10 cysteinyl residues and shows a 
striking similarity to the extracellular ligand-binding 
domain of frizzled receptors, with a frizzled-like distribution
of the cysteines. 9,12



■ References 

1. Oh, S. P., Kamagata, Y., Muragaki, Y., Timmons, S., Ooshima,
A., and Olsen, B. R. (1994). Proc. Natl Acad. Sci. USA, 91,
4229-33.

2. Myers, J. C., Kivirikko, S., Gordon, M. K., Dion, A. S., and
Pihlajaniemi, T. (1992). Proc. Natl Acad. Sci. USA, 89,
10144-8.

3. Muragaki, Y., Abe, N., Ninomiya, Y., Olsen, B. R., and
Ooshima, A. (1994). J. Biol. Chem., 269, 4042-6.

4. Kivirikko, S., Heinamaki, P., Rehn, M., Honkanen, N., Myers,
J. C., and Pihlajaniemi, T. (1994). J. Biol. Chem., 269,
4773-9.

5. Rehn, M. and Pihlajaniemi, T. (1994). Proc. Natl Acad. Sci.
USA, 91, 4234-8.

6. Rehn, M., Hintikka, E., and Pihlajaniemi, T. (1994). J. Biol.
Chem., 269, 13929-35.

7. Pihlajaniemi, T. and Rehn, M. (1995). Prog. Nucl. Acids Res.
Mol. Biol., 50, 225-62.

8. Kivirikko, S., Saarela, J., Myers, J. C., Autio-Harmainen, H.,
and Pihlajaniemi, T. (1995). Am. J. Pathol., 147, 1500-9.

9. Muragaki, Y., Timmons, S., Griffith, C. M., Oh, S. P., Fadel, B.,
Quertermous, T., and Olsen, B. R. (1995). Proc. Natl Acad. Sci.
USA, 92, 8763-7.

10. Hagg, P. M., Hagg, P. O., Peltonen, S., Autio-Harmainen, H.,
and Pihlajaniemi, T. (1997). Am. J. Pathol., 150, 2075-86.

11. Oh, S. P., Warman, M. L., Seldin, M. F., Cheng, S. D., Knoll, J.
H., Timmons, S., and Olsen, B. R. (1994). Genomics, 19, 494-9.






12. Rehn, M. and Pihlajaniemi, T. (1995). J. Biol. Chem., 270,
4705-11.

13. Rehn, M., Hintikka, E., and Pihlajaniemi, T. (1996). Genomics,
32, 436-46.

14. Myers, J. C., Dion, A. S., Abraham, V., and Amenta, P. S.
(1996). Cell Tissue Res., 286, 493-505.

15. O'Reilly, M. S., Boehm, T., Shing, Y., Fukai, N., Vasios, G.,
Lane, W. S., et al. (1997). Cell, 88, 277-85.

16. Hagg, P. M., Muona, A., Lietard, J., Kivirikko, S., and
Pihlajaniemi, T. (1998). J. Biol. Chem., 273, 17824-31.



Bjorn Reino Olsen
Department of Cell Biology,
Harvard Medical School, and Harvard-Forsyth
Department of Oral Biology, Harvard School
of Dental Medicine,
Boston, MA, USA

Yoshifumi Ninomiya 
Department of Molecular Biology and 
Biochemistry, Okayama University 
Medical School, 2-5-1 Shikata-cho, 
Okayama 700, Japan 



Collagens with transmembrane domains



Types XIII and XVII collagen are cell surface molecules 
with multiple extracellular triple-helical domains, con- 
nected to a cytoplasmic region by a transmembrane 
segment. They represent a new class of cellular adhesion 
molecules by which cells are connected to extracellular 
matrix. Type XIII collagen is expressed on the surface of 
fibroblasts, while type XVII collagen is a component of 
hemidesmosomes in epithelial cells. 

Although the overall structures of types XIII and XVII 
collagen are quite different, it is reasonable to include 
them in a separate group of collagenous proteins, based 
on their membrane association. In analogy with the term 
FACIT for fibril-associated collagens, the type XIII/XVII 
group has therefore been designated the MACIT (membrane-associated
collagens with interrupted triple-helices)
group. 1 As discussed in the overview of the
collagen superfamily (pp. 380-382), one can also include 
the macrophage scavenger receptors 2 and MARCO 3 in this 
group of proteins (Fig. 1). 

Type XIII collagen, initially identified by cross-hybridiza- 
tion during screening of a human cDNA library with a 
mouse type IV collagen probe, 4 is encoded by a gene that 
gives rise to a number of transcripts by alternative splic- 
ing. 5 These transcripts encode a polypeptide chain with 
three triple-helical domains, separated by non-triple- 
helical regions. In the different splice variants the length 
of the N- and C-terminal triple-helical domains varies con- 
siderably. It is believed that type XIII molecules are 
homotrimers; how the synthesis of the different variants 
can be reconciled with trimerization and the proper 
folding of triple-helical domains is not clear. 1 Type XIII 
collagen is widely expressed in human tissues and cell
lines. Western blots of extracts of human HT-1080 
fibrosarcoma cells show the presence of bands of 
expected size (about 67 and 54 kDa), 4 and these antibodies
show localization at discrete sites along the cell
surface (Fig. 2). By in situ hybridization, a1(XIII) transcripts
have been found in epidermis and hair follicles,
muscle, intestinal wall, cartilage, and bone. 6 In placenta,
stromal cells of the villi, endothelial cells of developing 
capillaries, and cells of the cytotrophoblastic columns are 
all positive for type XIII collagen transcripts. 7 

Type XVII collagen is a component of hemidesmosomes 
and represents the autoantigen BPAG2, causing an 
acquired blistering skin disease, bullous pemphigoid. 8,9
Sequencing of chicken, mouse, and human a1(XVII) colla- 
gen cDNA shows that it contains a large cytoplasmic N- 
terminal domain (almost 500 amino acid residues) with 
an extracellular triple-helical region consisting of eight 
heptad repeats, likely to form a coiled-coil trimer, and 15 
short triple-helical domains separated by non-triple-helical
regions. 10-13 Rotary shadowing of affinity-purified
type XVII collagen isolated from bovine mammary gland 
epithelial cells showed a structure composed of a globu- 
lar head, a central rod, and a flexible tail. 14 It is likely that 
the globular domain is the cytoplasmic region, the central 
rod is the heptad repeat region, and the flexible tail 
represents the interrupted triple-helical domain. 
Immunoelectron microscopy shows that type XVII colla- 
gen is a hemidesmosomal component with the extracellu- 
lar domains localized in the anchoring filaments between 
the cell surface and the lamina densa of the underlying 
basement membrane. 15 



■ Purification and recombinant synthesis

Type XIII collagen has not been isolated as a protein from 
tissues, but type XVII has been isolated from primary 
human keratinocytes, HaCaT cells, and bovine corneal 

Figure 1. Diagrams comparing the domain structures of membrane-associated polypeptides containing collagenous
sequences. The numbering of non-triple-helical and triple-helical domains is shown above the corresponding
polypeptide. Filled rectangles indicate non-triple-helical domains; open rectangles indicate triple-helical domains.
The scale below is given in amino acid residues, counted from the transmembrane domain. (Modified from
Pihlajaniemi and Rehn (1995), 1 courtesy of Dr T. Pihlajaniemi.)



epithelial cells. 16-18 Mouse Balb/K keratinocytes were
transfected with a full length type XVII collagen cDNA,
and the expressed protein was shown to assemble as a triple-helical homotrimer. 17 A
portion of the extracellular domain of type XVII collagen 
has been expressed as a recombinant protein in insect 
cells. 15 

■ Antibodies

Antipeptide antibodies are being used to study the 
expression and cellular localization of type XIII collagen. 1 
About half the sera from patients with bullous pem- 
phigoid and most sera from patients with herpes gesta- 
tionis contain autoantibodies against type XVII collagen. 
Monoclonal antibodies recognizing both the extracellu- 
lar and intracellular domains are available and have been 
used for immunofluorescence, Western blotting, and
immunoelectron microscopy. 16 

■ Activities 

Type XIII collagen is expressed at focal adhesion sites in 
cultured fibroblasts and may therefore represent a 
matrix-binding anchoring molecule at such sites (Fig. 2). 



Type XVII is part of the multiprotein hemidesmosome 
complex that mediates adhesion of epithelial cells to the 
underlying basement membrane. 19 Transfection experiments
with various mutant cDNAs suggest that the localization
of type XVII collagen in the hemidesmosome is
mediated by the cytoplasmic domain and requires interaction
with sequences in the cytoplasmic domain of the
β4 integrin subunit. 20

■ Genes 

cDNA and genomic clones for mouse and human a1(XIII) 
collagen are available. 21-24 Alternative splicing gives rise 
to multiple transcripts of 2.5-2.8 kb. 5,25 For type XVII collagen,
cloning of chicken, mouse, and human cDNAs has
been reported. 10-13 The entire human COL17A1 gene has
also been characterized. 26 



■ Mutant phenotype/disease states 

No mutations in type XIII collagen are known. In contrast, 
several mutations in COL17A1 have been demonstrated 
in patients with generalized atrophic benign epidermoly- 
sis bullosa (OMIM 226650). 27-29 This is a rare non-lethal 

Figure 2. Primary human skin fibroblasts stained with
rabbit polyclonal antibodies against type XIII collagen 
(top), and with monoclonal antibody against vinculin 
(bottom). 



variant of junctional epidermolysis bullosa, usually inher- 
ited as an autosomal recessive disorder, that can be 
caused by mutations in the β3 chain of laminin-5 30 in
addition to mutations in a1(XVII) collagen. Most of the 
type XVII collagen mutations described have resulted in 
premature termination codons in both alleles within the 
largest triple-helical subdomain. In a Finnish family, the 
proband was a compound heterozygote, with one allele 
containing a 5 bp deletion and the other a nonsense 
mutation. 26 Homozygosity for a mis-sense mutation in 
type XVII collagen has also been demonstrated in a 
patient with the localisata variant of junctional epider- 
molysis bullosa. 31 Detailed and updated information is 
available through the OMIM database. 

This is a heterogeneous group of proteins that on a
genetic basis do not belong to one of the defined colla- 
gen families. They are discussed here as a group only for 
practical reasons. As the human genome project moves 
forward it is possible that identification of additional 
collagen genes will allow classification of the two colla- 
gens discussed below as members of their own distinct 
families. 



Type VI collagen 



Type VI collagen is broadly expressed in different tissues 
as the major component of beaded microfibrils. 1 Each 
type VI molecule appears in the electron microscope as a 
105 nm long triple-helical rod flanked by two globular 
domains, 2 and contains three different polypeptide sub- 
units a1(VI), a2(VI), and a3(VI). The three chains have 
apparent molecular masses of about 140, 130 and 
250-350 kDa respectively. 1 * 3 

These heterotrimeric type VI molecules form disulphide 
bonded dimers and tetramers. The tetramers associate 
end-to-end and generate microfibrils, which have a char- 
acteristic periodicity of 100 nm.** The complete primary 
structures of the a1(VI), a2(VI), and the a3(VI) chains have 
been determined from amino acid and cDNA sequenc- 
ing. 9-19 The chains contain a central, relatively short 
triple-helical domain of 335-336 amino acid residues. All 
three chains contain a C-terminal non-triple-helical 
domain composed of two repeats of a 200 residue long 
segment that is homologous to the A domains of von 
Willebrand factor. The a3(VI) chain contains in addition a 
proline-rich region showing homology with domains in 
salivary proteins, a fibronectin type 3 repeat-like domain, 
and a domain that is similar to a region found in serine 
protease inhibitors of the Kunitz type. 14,15 In the N-terminal
region of the a1(VI) and a2(VI) chains there is a single
200 residue long von Willebrand factor A homology 
domain, while the a3(VI) chain contains up to nine such 
repeats in this region. Binding sites for type I collagen 
have been ascribed to the von Willebrand factor A 
region, 20 and it is possible that the homologous domains 
in type VI collagen have collagen binding properties as 



well. It is also possible that type VI collagen has a cell 
adhesion function; 21 several Arg-Gly-Asp sequences are 
found in the primary sequence of the type VI collagen 
subunits and experiments with neural crest cells suggest 
that regions in the N- and C-terminal globular domains 
play a role in cell adhesion and migration. 22 

Alternative splicing of exons in the 5' region of the 
a3(VI) collagen gene leads to the formation of several 
transcripts encoding polypeptides with N-terminal 
globular domains of different size. 18,23-26 Splice variations
in the 3' region of the a2(VI) gene affecting the structure
of the C-terminal globular domain have also been
described. 10,17,27



■ Purification and recombinant synthesis

The triple-helical portion of type VI collagen can be 
obtained by differential precipitation with NaCI from 
pepsin digests of various tissues in acetic acid or formic 
acid. Further purification can be accomplished by reprecipitation
through dialysis against 0.02 M Na2HPO4
followed by ion exchange or molecular sieve chromatography.
11,28 Intact type VI collagen can be purified by ion
exchange and molecular sieve chromatography of
guanidine or urea extracts of tissues or cell cultures. 1,29
Procedures for isolating intact type VI collagen-contain- 
ing microfilaments have also been described. 30 

The C-terminal Kunitz-type domain of a3(VI) chains has
been generated as a recombinant protein and used for
structural studies. 31 A large portion of the N-terminal
globular domain of a3(VI) has also been synthesized as a
recombinant protein. 32 All three type VI collagen a chains 
have been expressed as recombinant proteins in murine 
NIH/3T3 cells and shown to assemble into monomers, 
dimers, and tetramers. 33 



■ Antibodies 

Several polyclonal and monoclonal antibodies are available
against type VI collagen. 1,34-38 They have been used
for detecting type VI chains or degradation products by
immunoblotting, immunoprecipitation, and immunohistochemistry.
1 Monoclonal antibodies have been used
for epitope mapping by rotary shadowing electron
microscopy. 1,39 Anti-type VI collagen antibodies are available
from several commercial sources. See BioSupplyNet
Source Book for suppliers.

■ Activities 

Type VI collagen molecules assemble into disulphide 
bonded polymers that form beaded microfibrils. 1,40,41 The
microfibrils frequently aggregate further laterally into
cross-banded fibres, referred to as Luse bodies, fusiform
bodies, or zebra collagen. 1,42,43 Type VI collagen binds to
hyaluronan; 44 binding sites for heparin and hyaluronan 
have been identified within the N-terminal globular 
domain of a3(VI) chains. 32 Type VI collagen also binds to 
the membrane-associated chondroitin sulphate pro- 
teoglycan NG2 45-47 and interacts with the microfibril- 
associated glycoprotein-1. 48 



■ Genes 

cDNAs encoding type VI collagen chains in humans, 
chicken, and mouse have been isolated and 
sequenced. 19,49,50 The a2(VI) gene generates transcripts
that are alternatively spliced at the 3' end, giving rise to 
several mRNA variants. 3 Several variants are also gener- 
ated by alternative splicing in the 5' region of a3(VI) tran- 
scripts. The human COL6A1 and COL6A2 genes are 
organized in a head to tail arrangement on chromosome
21q22.3, 51 and both genes have been characterized and
compared with the corresponding chicken genes. 52-54 



■ Mutant phenotype/disease states 

Jobsis et al. (1996) 55 demonstrated linkage to the 
COL6A1/COL6A2 locus on chromosome 21q22.3 in nine
kindreds with the Bethlem form of autosomal dominant 
myopathy with contractures (OMIM 158810). A mis-sense 
mutation involving a glycine residue in the triple-helical 
domain was found in COL6A1 in one family and in 
COL6A2 in two other families. 55 Analysis of a large French 
Canadian family showed linkage to the COL6A3 locus on 
chromosome 2q37. 56 Pan et al. have described a missense
mutation in COL6A3. 93



■ Structure 

The crystal structure of the Kunitz-type domain in the C- 
terminal region of a3(VI) chains has been determined at 
a 1.6 Å resolution, 57 and the solution structure and backbone
dynamics of the domain have been analysed by
NMR. 58 



Type VII collagen 



Type VII collagen is the major collagenous component of 
anchoring fibrils associated with the basement membranes

Figure 1. Ultrastructural immunolocalization of type VII collagen within the dermal-epidermal junction of neonatal
human foreskin with gold-conjugated antibodies. AF, anchoring fibrils; AP, anchoring plaques. (From Keene et al.
1987. 62)

under stratified squamous epithelia. 59-61 The fibrils
originate from the lamina densa and extend into the 
upper papillary dermis of skin where they insert into so- 
called anchoring plaques 62 (Fig. 1). Anchoring fibrils also 
connect anchoring plaques. Type VII collagen molecules 
are homotrimers containing a triple-helical domain that is 
about 50 per cent longer than the triple helix of fibrillar 
collagens. This domain is flanked by relatively large N- 
and C-terminal non-triple-helical domains, of molecular 
masses 150 and 30 kDa, respectively. 60,61 The 30 kDa
domain is proteolytically cleaved extracellularly, and the
processed molecules form antiparallel dimers, through a 
C-terminal overlap region. (On the basis of the initial 
protein studies, the large globular domain was erro- 
neously identified as the C-terminal domain; 63 molecular 
cloning later showed that the large globular domain was 
at the N terminus. 64 ) Lateral aggregation of such dimers 
leads to the formation of the centro-symmetrically 
banded anchoring fibrils. 59 

Keratinocytes are the cells of origin for type VII collagen 
in skin, 65 and proteolytic processing of the C-terminal glob- 
ular domain precedes assembly of anchoring fibrils. 66 The 
N-terminal globular domain has a modular structure, 67 
including nine fibronectin type III-like repeats and a von
Willebrand factor A-like module. 68 The C-terminal globular 
domain contains eight cysteines; six of these are contained 
within a module that is similar to the Kunitz-type module 
in the C-terminal region of a3(VI) collagen chains. 69 

■ Purification and recombinant synthesis 

The triple-helical domain of type VII collagen can be solubilized 
by pepsin extraction of human skin or amnion. 
Purification is by differential salt precipitation with NaCl, 
followed by ion exchange chromatography and HPLC. 59 
The intact, biosynthetic form of type VII collagen has 
been purified from the media of KB cells (derived from a 
human oral basal cell carcinoma) and WISH cells (derived 
from amniotic epithelial cells). 59 Recombinant fusion proteins 
have been used for epitope mapping and detection 
of autoantibodies in patient sera. 70-72 

■ Activities 

Type VII collagen molecules form the anchoring fibrils in 
skin, chorioamnion, oral mucosa, cornea, and the uterine 
cervix. 59 Laminin-5, 73 a component of anchoring filaments 
and a ligand for the integrin α6β4 within hemidesmosomes, 
binds to the N-terminal globular domain of type 
VII collagen. 74 Interactions of the N-terminal domain with 
fibronectin and type I collagen, 76 and between the type 
VII collagen triple-helical region and fibronectin, 76 are 
also likely. 

■ Antibodies 

A number of polyclonal and monoclonal antibodies 
against type VII collagen are available. 59 " 79 Autoantibodies 
from patients with acquired epidermolysis 
bullosa have been shown to react with specific epitopes 
in α1(VII) collagen chains. 70-72, 80 

■ Genes 

Screening of a cDNA expression library with autoantibodies 
against type VII collagen from a patient with acquired 
epidermolysis bullosa resulted in the first isolation of 
human α1(VII) collagen cDNA. 81 This led to the isolation 
of cDNAs covering the entire mRNA, 68, 69 and characterization 
of the entire human COL7A1 gene. 82 The mouse 
cDNA and Col7a1 gene have also been characterized. 83, 84 



■ Mutant phenotype/disease states 

The epidermolysis bullosa group of inherited blistering 
diseases in humans is classified into simplex, junctional, 
and dystrophic forms. The simplex forms are caused by 
mutations in keratins 5 and 14, 85-87 the junctional forms 
are caused by mutations in laminin-5, 88 and the dys- 
trophic forms are the consequences of mutations in type 
VII collagen. 88 The mutations in COL7A1 range from pre- 
mature termination codons resulting in severe, mutilat- 
ing recessive dystrophic epidermolysis bullosa of the 
Hallopeau-Siemens type (OMIM 226600) 89, 90 to glycine 
substitutions in the triple-helical region of α1(VII) collagen 
resulting in clinically less severe, dominant or recessive, 
dystrophic epidermolysis bullosa. 91 A clinical variant 
of dominant dystrophic epidermolysis bullosa called the 
Bart syndrome (OMIM 132000) is caused by a 
glycine-substitution mutation in α1(VII) collagen. 92 An updated 
listing of all mutations in type VII collagen can be found 
in the OMIM database (OMIM 120120 collagen). 
Decorin 



Decorin (DCN) is a small proteoglycan composed of a 
~38 kDa core protein usually modified with a single 
chondroitin sulphate (bone) or dermatan sulphate (most 
soft tissues) glycosaminoglycan chain and two or three 
N-linked oligosaccharides. DCN is virtually ubiquitous in the 
matrices of various connective tissues, being found bound 
to or 'decorating' the collagen fibrils. The protein portion 
is composed of 10 tandem repeats of ~25 amino acids 
characteristically rich in ordered leucines with the repeats 
being flanked by two cysteine disulphide loops. These 
tandem repeats are found in a wide variety of closely 
related small proteoglycans including: biglycan (BGN), 
fibromodulin, lumican, epiphycan, keratocan, and PG-Lb. 
The most commonly cited functions of DCN are its roles in 
collagen fibril assembly (and stabilization) as well as its 
ability to bind to TGF-β. 



■ Synonymous names 

Decorin has several synonymous names, most reflecting 
its relative position on SDS-PAGE or time of elution from 
various purification columns. The names include PG40, 
PG-2, PG-II, PG-S2, CS-PGII, and DS-PGII. 



■ Protein properties 

Decorin is a member of a growing family of small proteo- 
glycans whose unifying characteristics are two highly con- 
served cysteine loops flanking 5 to 10 tandem repeats. 
Each repeat is nominally ~25 amino acids in length and is 
based on the pattern LxxbcLxxNxLx^.^,. For DCN there 
are 10 repeats and the single glycosaminoglycan (GAG) 
chain is chondroitin sulphate in bone matrix and der- 
matan sulphate in most soft tissues. Other members of 
this family include biglycan, fibromodulin, lumican, epi- 
phycan, keratocan, and PG-Lb (known as DSPG3 in 
human) (for a review see ref. 1). The DCN sequences from 
a number of species have been reported, including 
human, 2 cow, 3 mouse, 4 rat, 5 rabbit, 6 and chicken. 7 
Curiously, the chicken form can have two GAG chains and 
these chains appear to be attached to Gly-Ser sites 
rather than the apparently universal mammalian 
Ser-Gly. 7 Using human DCN as the model, decorin has 359 
amino acids (~39 700 Da) including 17 in the leader 
sequence and 14 more in the amino terminus that are 
often removed and are therefore considered to be a 
propeptide region. 2 The 'mature' core protein (lacking 
the propeptide), made by removing the disaccharide 
ABSTRACT We report on the isolation of mouse cDNA 
clones which encode a collagenous sequence designated here as 
the α1 chain of type XVIII collagen. The overlapping clones 
cover 2.8 kilobases and encode an open reading frame of 928 
amino acid residues comprising a putative signal peptide of 25 
residues, an amino-terminal noncollagenous domain of 301 
residues, and a primarily collagenous stretch of 602 residues. 
The clones do not cover the carboxyl-terminal end of the 
polypeptide, since the translation stop codon is absent. Characteristic 
of the deduced polypeptide is the possession of eight 
noncollagenous interruptions varying in length from 10 to 24 
residues in the collagenous amino acid sequence. Other features 
include the presence of several putative sites for both N-linked 
glycosylation and O-linked glycosaminoglycan attachment and 
homology of the amino-terminal noncollagenous domain with 
thrombospondin. It is of particular interest that five of the eight 
collagenous sequences of type XVIII show homology to the 
previously reported type XV collagen, suggesting that the two 
form a distinct subgroup among the diverse family of collagens. 
Northern blot hybridization analysis revealed a striking tissue 
distribution for type XVIII collagen mRNAs, as the clones 
hybridized strongly with mRNAs of 4.3 and 5.3 kilobases that 
were present only in lung and liver of the eight mouse tissues 
studied. 

The collagens comprise a large family of heterotrimeric or 
homotrimeric triple-helical proteins that constitute the major 
structural components of the extracellular matrix. Several 
other proteins are known to contain short triple-helical collagen 
domains but are not classified as collagens, as they do 
not participate in assembly of the extracellular matrix (1-3). 
The vertebrate collagens can be divided into two groups, 
fibrillar and nonfibrillar, on the grounds of their primary 
structure and supramolecular assemblies (1, 2). All collagen 
molecules contain a central collagen domain consisting of 
repeating Gly-Xaa-Yaa triplets and noncollagenous domains 
at their termini. The fibrillar group comprises the classical 
collagens, types I-III, and types V and XI. These molecules 
contain collagenous domains of about 1000 aa, highly conserved 
carboxyl-terminal noncollagenous domains of about 
250 aa, and variable amino-terminal noncollagenous domains 
of 50-520 aa. The fibrillar collagens participate in highly 
ordered quarter-staggered fibrils that provide tensile strength 
for the tissues. 
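
The Gly-Xaa-Yaa repeat described above can be located in a protein sequence with a simple scan. The following is a minimal Python sketch, assuming a one-letter amino acid string as input. The function name `find_collagenous_runs`, the `min_triplets` threshold, and the toy sequence in the usage note are illustrative assumptions and do not come from the paper, which does not state how its domains were delimited.

```python
def find_collagenous_runs(seq, min_triplets=3):
    """Return (start, end) pairs of uninterrupted Gly-Xaa-Yaa runs.

    start is 0-based and inclusive, end is exclusive. A run is a stretch
    in which every third residue, counted from its first Gly, is glycine.
    min_triplets is a hypothetical cutoff, not a value from the paper.
    """
    runs = []
    i = 0
    n = len(seq)
    while i < n:
        if seq[i] != "G":
            i += 1
            continue
        j = i
        # Extend the run one complete triplet at a time.
        while j + 3 <= n and seq[j] == "G":
            j += 3
        if (j - i) // 3 >= min_triplets:
            runs.append((i, j))
            i = j
        else:
            i += 1
    return runs
```

For a toy sequence such as `"MKLGPPGAPGEKDMEASWGPPGLPGPK"`, the function reports two runs separated by the six-residue stretch `DMEASW`, which plays the role of an interruption. Note that any Gly-initiated triplet counts, so a short noncollagenous stretch that happens to keep glycine in phase would be merged into the neighbouring run.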

The nonfibrillar group comprises collagen types IV, VI-X 
and XII-XVII (1-3). These molecules display great heterogeneity 
in structure, tissue location, macromolecular organization, 
and function. One common feature is that they all 
have one or more interruptions in the collagenous sequence. 
Their collagenous sequences vary in length from about 330 to 
1400 aa, the shortest being found in type VI collagen molecules 
and the longest in type VII. Their carboxyl-terminal and 
amino-terminal noncollagenous domains also are highly variable 
in both sequence and length, the latter varying in both 
domains from <20 aa to several hundred amino acids. One 
subgroup among the nonfibrillar collagens is formed by the 
fibril-associated collagens with interrupted triple helices 
(FACIT): types IX, XII, and XIV (2). These collagens share 
sequence homology and do not appear to form polymers 
alone but are associated with fibrils composed of fibrillar 
collagens. Another subgroup is formed by the structurally 
homologous types VIII and X, which are thought to form 
sheets in the extracellular matrix (2). The recently described 
types XV (4), XVI (5), and XVII (6) differ from the other 
nonfibrillar collagens in being characterized by numerous 
interruptions in their triple-helical regions. Type XVI collagen 
shares some structural features with the FACIT collagens, 
as also does another recently characterized form called 
Y-collagen (7, 8). Type XVII (6), a hemidesmosomal protein 
also known as the 180-kDa bullous pemphigoid antigen, is 
unique among the collagens in that it is thought to be a 
transmembrane protein. 

The publication costs of this article were defrayed in part by page charge 
payment. This article must therefore be hereby marked "advertisement" 
in accordance with 18 U.S.C. §1734 solely to indicate this fact. 

Collagen types XIII and XV and the α5 chain of type IV 
collagen were identified in our laboratory by screening of 
cDNA libraries under low stringency with probes encoding 
collagenous sequences (4, 9, 10). Recently we screened a 
mouse cDNA library to obtain clones coding for the mouse 
counterpart to the previously characterized human type XIII 
collagen (9). One of the positive clones was found to encode 
a collagenous protein not described before.† We present here 
a partial characterization of this polypeptide, which is characterized 
by multiple interruptions in the triple helix, and 
suggest that it should be designated the α1 chain of type 
XVIII collagen. Our findings indicate that type XVIII collagen 
has an unusual tissue location. Furthermore, type XVIII 
was found to be homologous with type XV, and the two thus 
form a subgroup among the collagens. 

MATERIALS AND METHODS 

Isolation of cDNA Clones and DNA Sequencing. A 500-bp 
clone, G2, encoding the α1 chain of murine type XIII collagen 
(unpublished results) was used as a probe to screen a mouse 
embryo λgt11 cDNA library (ML 1027a, Clontech) under 
stringent conditions (11). The final wash for the filters was at 
50°C in 0.5x standard saline citrate (SSC)/0.1% NaDodSO4. 
The recombinant phage ME-1 was isolated and the insert 
DNA was subcloned to the EcoRI site of pBluescript SK 
(Stratagene). The nucleotide sequence was determined for 

Abbreviation: FACIT, fibril-associated collagen(s) with interrupted 
triple helices. 

*To whom reprint requests should be addressed at: Department of 
Medical Biochemistry, University of Oulu, FIN-90220 Oulu, Finland. 

†The sequences reported in this paper have been deposited in the 
GenBank database (accession no. L16898). 
both strands of the cDNA by the dideoxy nucleotide method 
(12) using the enzyme Sequenase (United States Biochemical) 
and vector- or insert-specific primers. The same library 
was screened with the ME-1 cDNA under stringent conditions 
as above but the final wash was at 65°C in 0.5x SSC. 
The positive recombinant phages were isolated and characterized 
as above. 

Northern Blot Analysis. A mouse multi-tissue Northern blot 
(Clontech) containing 2 μg of poly(A)+ RNA per sample 
isolated from various adult mouse tissues was hybridized 
under stringent conditions with the 32P-labeled probe SXT-5. 
Hybridization was carried out as suggested in the manufacturer's 
protocol except that the final wash at 6 
SSC instead of 0.1x SSC. The intactness of the RNA samples 
on the blot was checked with the β-actin probe provided with 
it. The band intensities were scanned with a BioImage 
densitometer (Millipore). 

Sequence Analysis. Nucleotide and amino acid homology 
comparisons were carried out against the GenBank, EMBL, 
PIR, and Swiss-Prot databases at the National Center for 
Biotechnology Information with the BLAST network service 
(13). The search for functional patterns of amino acid sequences 
was carried out with the PROSITE database (14). 

RESULTS AND DISCUSSION 

Isolation of Mouse cDNA Clones Encoding the α1(XVIII) 
Collagen Chain. A 500-bp cDNA, G2, that encodes the α1 
chain of mouse type XIII collagen (unpublished results) was 
used as a probe to screen an 11.5-day mouse embryo cDNA 
library. Five positive signals were identified among ~900,000 
clones. One of these, ME-1, contained a 2.3-kb cDNA insert 
that coded for a collagenous polypeptide. Rescreening of 
about 600,000 recombinants of the same library with ME-1 
resulted in the identification of 2 additional clones, SXT-1 
and SXT-5, with inserts of 0.6 kb and 2.8 kb, respectively. 
Together these 3 clones cover 2.8 kb of the corresponding 
mRNA sequence (Fig. 1). The nucleotide and amino acid 
sequences derived from them were not compatible with any 
of the previously characterized collagens I-XVII or any other 
reported collagenous sequence (1-10). It is thus proposed 
that the polypeptide encoded by the clones should be designated 
the α1 chain of type XVIII collagen. 

Partial Nucleotide and Amino Acid Sequences of the Mouse 
and Human α1(XVIII) Collagen Chains. The mouse clones 
encode an open reading frame of 928 aa preceded by 20 nt of 
5' untranslated sequence (Fig. 2). The other reading frames 
contain multiple stop codons. The presumed translation-initiation 
codon is encoded by nt 21-23. Sequences surrounding 
the codon for methionine match well with the best-conserved 
nucleotides (underlined) of the proposed consensus 
sequence for initiation of translation, GCC(R)CCAUGG 
(15). The amino-terminal end of the predicted polypeptide 
contains a hydrophobic sequence that clearly fulfills the 
criteria for a signal peptide, and on comparison with other 
proteins this sequence was found to be highly homologous 
with the signal peptide of decorin, the identity being 80% 
among the 10 residues preceding the proposed cleavage site 
for human decorin (data not shown; for decorin sequence, see 
ref. 16). Thus, comparisons with other proteins and prediction 
of the signal-peptide cleavage site by the method of von 
Heijne (17) led to the suggestion that the α1(XVIII) collagen 
chain has a signal peptide of 25 aa. Positions -3 and -1 are 
occupied by serine and alanine, residues frequently found in 
these positions (17). The presence of the signal peptide 
suggests that the polypeptide is secreted into the extracellular 
matrix. 

Fig. 1. (Upper) The overlapping cDNA clones ME-1, SXT-1, and SXT-5 
and the locations of the EcoRI (E) and BamHI (B) restriction sites. 
The EcoRI site shown in parentheses represents a linker site introduced 
during cloning. (Lower) cDNA-derived polypeptide structure. 
The numbering of the noncollagenous (NC) and collagenous (COL) 
domains is shown below the polypeptide, and the lengths of these 
domains in amino acids are given above the polypeptide. The 
numbering of the domains begins from the carboxyl end of the 
polypeptide, based on carboxyl-terminal sequence characterized by 
Oh et al. (18). The dashed lines indicate that the clones do not cover 
the carboxyl-terminal end of the polypeptide. Thus, the COL2 
domain is expected to be >11 aa. Dark box, signal peptide; ATG, 
putative translation initiation codon; stippled boxes, noncollagenous 
sequences; open boxes, collagenous sequences; NIS and NGS, 
potential N-linked glycosylation sites; GSG, potential O-linked glycosylation 
sites; C, cysteine; RGD, potential cell attachment site; 
tsp, thrombospondin homology area. 

The putative signal peptide is followed by a 301-aa noncollagenous 
domain that contains the only cysteine residues of 
the portion of the polypeptide encoded by the clones described 
here (Fig. 2). The rest of the sequence consists of a 
602-aa primarily collagenous sequence (Fig. 2). The clones do 
not fully cover the carboxyl-terminal end of the predicted 
polypeptide, however, since the stop codon is lacking. A 
notable feature of the collagenous sequence is that it contains 
eight interruptions. The eight collagenous domains interspersed 
by the interruptions range in size from 21 to 122 aa, 
and five of the seven noncollagenous domains vary in size 
from 10 to 14 aa, while two are longer ones, of 23 and 24 aa. 
Furthermore, the four longest collagenous domains contain a 
total of five short imperfections that are due to the lack of one 
residue of the collagenous Gly-Xaa-Yaa triplet. The collagenous 
sequences are rich in proline, as this amino acid 
residue represents 27% of all the residues in the Gly-Xaa-Yaa 
triplets. Fifty-eight percent of the prolines are in the Yaa 
position and, thus, are subject to 4-hydroxylation (3). The 
polypeptide structure is presented schematically in Fig. 1, 
with the noncollagenous and collagenous domains numbered 
from the carboxyl-terminal end. 
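
The two proline figures quoted above (the share of triplet residues that are proline, and the share of those prolines that occupy the Yaa position) are simple counts over an in-frame collagenous sequence. A minimal Python sketch of that arithmetic follows. The function name `proline_stats` and the toy sequence in the usage note are assumptions for illustration; the paper does not describe how its percentages were computed.

```python
def proline_stats(collagenous_seq):
    """Summarise proline usage in an in-frame Gly-Xaa-Yaa sequence.

    Returns a tuple:
      (fraction of all residues that are Pro,
       fraction of the prolines that sit in the Yaa position).
    The input is assumed to start on a Gly and to contain whole
    triplets only, with interruptions already removed.
    """
    if not collagenous_seq or len(collagenous_seq) % 3 != 0:
        raise ValueError("expected a non-empty whole number of triplets")
    total_pro = collagenous_seq.count("P")
    # Every third residue starting at index 2 is a Yaa position.
    yaa_pro = collagenous_seq[2::3].count("P")
    pro_fraction = total_pro / len(collagenous_seq)
    yaa_share = yaa_pro / total_pro if total_pro else 0.0
    return pro_fraction, yaa_share
```

Applied to the toy sequence `"GPPGAPGEKGPA"`, four of the twelve residues are proline and two of those four lie in the Yaa position, so the function returns one third and one half.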

Oh et al. (18) have independently isolated cDNAs that also 
code for the polypeptide described here. The 5' sequences of 
their clones differ from the first 99 nt of our clones, which 
may indicate that the α1(XVIII) gene has alternative promoters 
or that its transcripts are subject to alternative splicing. 
As the clones by Oh et al. (18) cover the carboxyl-terminal 
end of the polypeptide, it can be estimated that our clones 
cover only part of the penultimate collagenous domain and 
lack sequences corresponding to the last collagenous domains 
and the carboxyl-terminal noncollagenous domain. 

Types XVIII and XV Form a Subclass Within the Collagen 
Family. Since the total number of the residues in the interruptions 
is ID, 19% of the residues in the portion of the 
α1(XVIII) collagenous sequence described here are not located 
in the Gly-Xaa-Yaa repeats. The fact that it contains 
frequent interruptions in the collagenous sequence means 
that type XVIII collagen resembles three other collagen 
chains: α1(XV), α1(XVI), and α1(XVII). The α1(XV) chain 
has a 577-aa collagenous domain with 8 interruptions containing 
33% of the collagenous-domain residues (4), α1(XVI) 
a 1244-aa collagenous domain with 9 interruptions containing 
15% of the collagenous residues (5), and α1(XVII) an 846-aa 
collagenous domain with 12 interruptions hosting 36% of the 
residues (6). The human α1(XV) chain has been reported to 
consist of nine collagenous domains, termed here COL9-COL1 
(numbered from carboxyl terminus to amino terminus), 
with sizes of 18, 114, 35, 45, 71, 2Q, 18, 55, and 15 aa, 
respectively (4). The four extreme carboxyl-terminal collagenous 
domains fully covered by the mouse α1(XVIII) clones 
are 42, 73, 33, and 21 aa [COL6-COL3 in Fig. 1, numbered 
from the carboxyl-terminal end of the polypeptide as suggested 
by Oh et al. (18)], being thus similar in size to the four 
underlined collagenous domains of the α1(XV) chain. Closer 
comparison indicates that these four collagenous domains of 
the α1(XVIII) and α1(XV) chains are homologous in their 
amino acid sequences (Fig. 3), this homology being most 
notable between the 71-residue COL5 domain of α1(XV) and 
the 73-residue COL5 of α1(XVIII), with 59% identity. The 
collagenous domain of the α1(XV) chain preceding the four 
underlined domains (see above) is clearly different both in 
size, 35 aa, and in amino acid sequence from the correspondingly 
located 83-aa COL7 domain of the α1(XVIII) chain. 
Interestingly, the next collagenous domain of the α1(XV) 
chain, COL8, stands out again as being similar in size, 114 aa, 
to the 122-aa COL8 domain of α1(XVIII). Alignment of these 
collagenous domains in the two chains shows identity mainly 
in sequences that involve Gly-Pro-Pro repeats at both ends of 
the domains (Fig. 3). Repeats of Gly-Pro-Pro triplets are 
commonly found in collagen chains adjacent to noncollagenous 
domains and thus do not necessarily point to a close 
evolutionary relationship between these chains. The α1(XV) 
COL8 and α1(XVIII) COL8 domains nevertheless possess a 
short imperfection in identical locations (Fig. 3), suggesting 
that these domains are indeed homologous. The homologous 
COL5 domains of the α1(XV) and α1(XVIII) chains also 
contain similarly located imperfections, suggesting that conservation 
of the imperfections is functionally implicated (Fig. 
3). 
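
A percent identity such as the 59% reported for the two COL5 domains is obtained by counting matching residues across an alignment. The following Python sketch shows one common convention, in which only columns without a gap in either sequence are compared. The function name `percent_identity`, the gap character, and the toy alignment in the usage note are assumptions for illustration; the paper does not state which denominator it used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned, i.e. have equal length with
    gap characters inserted. Other conventions (for example dividing by
    the length of the shorter sequence) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For the toy alignment `"GPPGAP-GEK"` against `"GPAGAPGGDK"`, nine columns are ungapped and seven of them match, giving roughly 77.8%.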

The α1(XVIII) chain contains one more of the collagenous 
domains at the beginning of the collagenous sequence than 
the α1(XV) chain. The extreme amino-terminal collagenous 
domain of the α1(XV) chain does not correspond in either 
size or sequence to either of the extreme amino-terminal 
collagenous domains of the α1(XVIII) chain, COL9 and 
COL8. It is thus not possible to fully align the two homologous 
polypeptides, indicating that they cannot represent 
different α chains of the same collagen type. The type XV 
collagen has hitherto been characterized via cDNA clones, 
and its function is not known (4), nor has it been found to be 
homologous with the FACIT subgroup of collagens or any of 
the other previously reported nonfibrillar collagens. Thus the 
amino acid sequence homology between collagen types 
XVIII and XV indicates that they represent a subfamily 
within the heterogeneous family of collagens. 

Thrombospondin Homology and Multiple Potential Glycosylation 
Sites in the α1(XVIII) Polypeptide. Homology 
searches against protein databanks showed the α1(XVIII) 
polypeptide to be homologous to a large amino-terminal 
segment of thrombospondin (Fig. 4), a multifunctional glycoprotein 
with affinity for several molecules (24). This 
~200-aa noncollagenous segment has previously been identified 
in the amino terminus of collagen types V, XI, and IX 
and has been found to be embedded in the large noncollagenous 
amino-terminal domain of collagen types XII and XIV 
(22, 23). Furthermore, a proline- and arginine-rich protein 
[PARP, which may represent a fragment of the α1 chain of 
type XI collagen (25)] has been found to contain this module 
(22). This sequence represents the amino-terminal heparin-binding 
domain of thrombospondin (24). The positions 
thought to be involved in heparin binding are not, however, 
conserved in any of the previously described collagens (22) or 
in the type XVIII collagen chain described here (Fig. 4). Thus 
the significance of this thrombospondin homology in the 
various collagen chains is unknown. 

A search for structural motifs in the α1(XVIII) polypeptide 
sequence led to the identification of two putative sites for 
N-linked glycosylation, an Asn-Ile-Ser sequence in the NC11 
domain and an Asn-Gly-Ser sequence near a short interruption 
in the COL8 domain (Figs. 1 and 2). Additional putative 
glycosylation sites were located in the NC9 and NC8 domains 
(Figs. 1 and 2) in the form of two sequences that conform to 
the consensus sequence [(Asp/Glu)-Xaa-Glu-Gly-Ser-Gly-Ser-Gly-Xaa-Leu] 
for O-linked glycosaminoglycan attachment 
in a number of proteins (26). Interestingly, these 
putative NC9 and NC8 glycosylation sites were identical in 
sequence for 6 aa, Asp-Met-Glu-Gly-Ser-Gly. As this sequence 
represents the only internal homology among the 
α1(XVIII) chain interruptions, the sequence conservation 
may provide further evidence for utilization of these two 
sequences in glycosaminoglycan attachment. Putative glycosylation 
sites that conform less well to the consensus sequence 
also exist, particularly in the NC11 domain. The 
possibility of type XVIII collagen containing a glycosaminoglycan 
side chain is supported by recent findings indicating 
the existence of such side chains in several collagens. More 
specifically, the FACIT collagens IX, XII, and XIV have 
been shown to contain a glycosaminoglycan side chain (27-29). 
Type XV collagen also contains multiple putative sites 
for both N- and O-linked glycosylation (4), further highlighting 
the similarity between collagen types XVIII and XV. 
Searches for other biologically significant sequence motifs 
revealed that the COL3 of α1(XVIII) contains one Arg-Gly-Asp 
sequence that may play a role in cell attachment (30). 
This sequence is not found in the corresponding homologous 
collagenous domain in type XV, however. 
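
Motif searches of the kind described above can be approximated with regular expressions. The sketch below is a stand-in written in Python, not the PROSITE search the authors ran. It looks for the simplified N-linked glycosylation sequon Asn-Xaa-Ser/Thr (with Xaa not proline) and for the Arg-Gly-Asp tripeptide. The names `N_GLYC`, `RGD`, and `find_motifs`, and the toy sequence in the usage note, are assumptions for illustration.

```python
import re

# Asn-Xaa-Ser/Thr with Xaa != Pro: a simplified N-glycosylation sequon.
# The lookahead lets overlapping sites be reported as well.
N_GLYC = re.compile(r"(?=(N[^P][ST]))")
# Arg-Gly-Asp tripeptide, a potential cell attachment site.
RGD = re.compile(r"(?=(RGD))")

def find_motifs(seq):
    """Return 0-based start positions of putative sequons and RGD sites."""
    return {
        "n_glycosylation": [(m.start(), m.group(1)) for m in N_GLYC.finditer(seq)],
        "rgd": [m.start() for m in RGD.finditer(seq)],
    }
```

On the toy sequence `"AANISKKNGSRGDNPT"`, the function reports `NIS` at position 2 and `NGS` at position 7 as putative N-glycosylation sites, skips `NPT` because proline follows the asparagine, and reports one RGD at position 10. A match only marks a candidate site; whether it is used in vivo is a separate experimental question.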

Restricted Tissue Distribution of Type XVIII Collagen Transcripts 
in Mouse. When a Northern blot containing poly(A)+ 
RNA isolated from mouse brain, heart, kidney, liver, lung, 
skeletal muscle, spleen, and testis was hybridized with the 
2.8-kb probe SXT-5, a clear hybridization signal was visible 
with lung and liver mRNA after only 3 hr of autoradiographic 
exposure (data not shown). With both tissues the probe 
hybridized to a major 4.3-kb transcript and a somewhat less 
abundant 5.3-kb transcript, whereas these bands were absent 
from the mRNAs isolated from the other tissues, even after 
a prolonged exposure (Fig. 5). The major 4.3-kb transcript 
comprised £3% and 74% of the type XVIII collagen transcripts 
in the lung and liver tissue, respectively. Two faint 
bands of 3.8 kb and 4.7 kb, clearly differing in size from the 
strong lung and liver signals, were seen in all samples except 
the heart and skeletal muscle RNAs (Fig. 5). It is possible that 
the 3.8- and 4.7-kb bands may be low-abundance alternatively 
spliced forms of the α1(XVIII) mRNA. Another explanation, 
however, may be that they are transcripts of a 
different gene. The same blot was also hybridized with a 
β-actin probe to confirm the intactness of the mRNAs 
isolated from each tissue. Strong hybridization signals were 
obtained from each tissue, with no sign of degradation of the 
RNAs (data not shown), thus excluding the possibility that 
the lack of α1(XVIII) transcripts in most tissues could be an 
artifact. 

Fig. 4. Thrombospondin homology in the amino-terminal noncollagenous domain of the mouse α1(XVIII) chain. The α1(XVIII) sequence 
is aligned here with thrombospondin and the α1(V) and α1(IX) collagen chains, the numbering indicating the number of amino acid residues in 
each polypeptide (19-24). The homologies of the α1(V) and α1(IX) collagen chains and certain other collagens with thrombospondin have been 
reported previously (22, 23). The conserved amino acid residues previously identified to be identical in thrombospondin and other matrix proteins 
(22) are indicated in bold type. The residues in the α1(XVIII) sequence that are identical to one or more of the other polypeptides shown here 
are marked with stars. The identified amino-terminal heparin binding sites in thrombospondin (24) are marked with bars. 

Fig. 5. Northern blot analysis of α1(XVIII) collagen mRNAs in 
mouse tissues. Each lane contained 2 μg of poly(A)+ RNA from the 
adult mouse tissues indicated. The blot was hybridized to a mouse 
α1(XVIII) collagen clone. Autoradiography time was 18 hr. 

Conclusions. The mouse clones described here code for a 
unique polypeptide, designated as the α1 chain of type XVIII 
collagen. Altogether, 928 aa were determined, including a 
signal peptide of 25 aa, an amino-terminal noncollagenous 
domain of 301 aa, and a 602-aa stretch of a collagenous 
region. Type XVIII collagen was found to resemble type XV 
collagen (4) in containing multiple interruptions and imperfections 
in the collagenous sequences and in having several 
sequences that may serve as sites for N- and O-linked 
glycosylation. Several of the variable-length collagenous 
domains of the two types were found to be similar in both size 
and sequence, leading to the suggestion that collagen types 
XVIII and XV form a subclass within the large family of 
collagenous proteins. 

The amino-terminal noncollagenous domain of type XVIII 
collagen contained an ~200-aa sequence that was homologous 
to thrombospondin. It has been reported that collagen 
types V, IX, XI, XII, and XIV contain this sequence module 
(22, 24), and we found it to be the only homology between 
type XVIII and the other collagens except for type XV. Thus, 
molecules belonging to different subclasses of collagens 
(i.e., the fibrillar and FACIT collagens, and also some other 
collagens) share this sequence module, although its functional 
significance in collagens is not known. Of interest is 
that the two cysteine residues that are conserved within this 
sequence in all collagens are known to form a disulfide bond 
in a proline- and arginine-rich protein, PARP (25). It thus 
seems likely that the only two cysteines found in the amino-terminal 
noncollagenous domain of the α1(XVIII) chain also 
take part in disulfide bond formation. 

Type XVIII collagen mRNAs had a striking tissue distribution, 
as demonstrated by the clear Northern signal in liver 
and lung RNA but not in brain, heart, kidney, skeletal 
muscle, spleen, or testis RNA. Further research will be 
required, however, to obtain a complete picture of the pattern 
of expression of this collagen. The present finding of marked 
amounts of mRNAs only in liver and lung among the eight 
mouse tissues studied already justifies the suggestion that 
type XVIII collagen mRNAs have a distinct tissue distribution 
that is not similar to that of any of the previously 
described collagens. 
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